When implementing DataFrame.withColumn(), the Spark dev team forgot to check whether the column name is already in use.
In the beginning:
val res = sqlContext.sql("select * from tag.tablename where dt>20150501 limit 1").withColumnRenamed("tablename","tablename")
res.columns
shows:
res6: Array[String] = Array(user_id, service_type_id, tablename, dt)
then
val res1 = res.withColumn("tablename",res("tablename")+1)
res1.columns
shows:
res7: Array[String] = Array(user_id, service_type_id, tablename, dt, tablename)
By the way, res1.show still works.
The BUG begins here:
res1.select("tablename")
org.apache.spark.sql.AnalysisException: Ambiguous references to tablename: (tablename#48,List()),(tablename#53,List());
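The failure above can be reproduced without the original table. The sketch below (a hypothetical stand-in: the tiny frame, the object name, and the renamed columns `tablename_old`/`tablename_new` are all made up for illustration) builds a DataFrame with two columns both named `tablename`, shows that selecting by name is ambiguous, and applies one possible workaround: a positional rename with toDF so every column gets a unique name again.

```scala
import org.apache.spark.sql.{AnalysisException, SparkSession}

object DuplicateColumnDemo {
  // Returns (did select("tablename") fail as ambiguous, columns after the rename workaround)
  def run(): (Boolean, Seq[String]) = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("dup-col-demo")
      .getOrCreate()
    try {
      import spark.implicits._
      // Hypothetical stand-in for the original table.
      val df = Seq((1, 10)).toDF("user_id", "tablename")

      // Build a frame with two columns sharing the name "tablename",
      // the same shape res1 ends up with in the report.
      val dup = df.select(df("tablename"), (df("tablename") + 1).as("tablename"))

      // Selecting the duplicated name by string throws AnalysisException.
      val ambiguous =
        try { dup.select("tablename"); false }
        catch { case _: AnalysisException => true }

      // Workaround: toDF renames columns positionally, restoring uniqueness.
      val fixed = dup.toDF("tablename_old", "tablename_new")
      (ambiguous, fixed.columns.toSeq)
    } finally spark.stop()
  }

  def main(args: Array[String]): Unit = println(run())
}
```

With unique names restored, fixed.select("tablename_new") resolves without ambiguity.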
(Actually, it rang a bell because I noticed the check-in fixing this some time ago.)