1

This is the table structure for the seven tables I'm trying to join into just one:

-- tables: en, fr, de, zh_cn, es, ru, pt_br `geoname_id` INT (11), `continent_code` VARCHAR (200), `continent_name` VARCHAR (200), `country_iso_code` VARCHAR (200), `country_name` VARCHAR (200), `subdivision_1_name` VARCHAR (200), `subdivision_2_name` VARCHAR (200), `city_name` VARCHAR (200), `time_zone` VARCHAR (200) 

And this is the new table structure, where all data will be stored:

CREATE TABLE `geo_lists` ( `city_id` int (11), -- en.geoname_id (same for all 7 tables) `continent_code` varchar (2), -- en.continent_code (same for all 7 tables) `continent_name` varchar (200), -- en.continent_name (just in english) `country_code` varchar (2), -- en.country_iso_code (same for all 7 tables) `en_country_name` varchar (200), -- en.country_name `fr_country_name` varchar (200), -- fr.country_name `de_country_name` varchar (200), -- de.country_name `zh_country_name` varchar (200), -- zh_cn.country_name `es_country_name` varchar (200), -- es.country_name `ru_country_name` varchar (200), -- ru.country_name `pt_country_name` varchar (200), -- pt_br.country_name `en_state_name` varchar (200), -- en.subdivision_1_name `fr_state_name` varchar (200), -- fr.subdivision_1_name `de_state_name` varchar (200), -- de.subdivision_1_name `zh_state_name` varchar (200), -- zh_cn.subdivision_1_name `es_state_name` varchar (200), -- es.subdivision_1_name `ru_state_name` varchar (200), -- ru.subdivision_1_name `pt_state_name` varchar (200), -- pt_br.subdivision_1_name `en_province_name` varchar (200), -- en.subdivision_2_name `fr_province_name` varchar (200), -- fr.subdivision_2_name `de_province_name` varchar (200), -- de.subdivision_2_name `zh_province_name` varchar (200), -- zh_cn.subdivision_2_name `es_province_name` varchar (200), -- es.subdivision_2_name `ru_province_name` varchar (200), -- ru.subdivision_2_name `pt_province_name` varchar (200), -- pt_br.subdivision_2_name `en_city_name` varchar (200), -- en.city_name `fr_city_name` varchar (200), -- fr.city_name `de_city_name` varchar (200), -- de.city_name `zh_city_name` varchar (200), -- zh_cn.city_name `es_city_name` varchar (200), -- es.city_name `ru_city_name` varchar (200), -- ru.city_name `pt_city_name` varchar (200), -- pt_br.city_name `time_zone` varchar (30) -- en.time_zone (same for all 7 tables) ); 

I'd like to join them all, using the locale (language) code as prefix for the column names.

5
  • Do you have all that data in all those languages?? Where did you get it??? Commented Jun 27, 2015 at 20:05
  • 1
    from here dev.maxmind.com/geoip/geoip2/geolite2 Commented Jun 27, 2015 at 20:06
  • Are you trying to insert or update? Sample data and desired results are very helpful, as is a SQL Fiddle. Commented Jun 27, 2015 at 20:56
  • @GordonLinoff I want to insert into an empty table (geo_lists). What I want is to join the 7 tables into one, adding the country code as a prefix in the table name. I'm messed up... I just added the table structure Commented Jun 27, 2015 at 21:25
  • @GordonLinoff Any idea why LuisTeijon's query it's not working? Any other suggestion? Commented Jun 28, 2015 at 12:56

5 Answers 5

2

Oh! @GabrielBlanca you are right, in that case try this query and let my know if it worked. You can copy and paste:

insert into geo_lists -- columns (city_id, continent_code, continent_name, country_code, time_zone, en_country_name, fr_country_name, de_country_name, zh_country_name, es_country_name, ru_country_name, pt_country_name, en_state_name, fr_state_name, de_state_name, zh_state_name, es_state_name, ru_state_name, pt_state_name, en_province_name, fr_province_name, de_province_name, zh_province_name, es_province_name, ru_province_name, pt_province_name, en_city_name, fr_city_name, de_city_name, zh_city_name, es_city_name, ru_city_name, pt_city_name) -- end columns select en.city_id, en.continent_code, en.continent_name, en.country_code, en.time_zone, en.country_name as en_country_name, fr.country_name as fr_country_name, de.country_name as de_country_name, zh.country_name as zh_country_name, es.country_name as es_country_name, ru.country_name as ru_country_name, pt.country_name as pt_country_name, en.state_name as en_state_name, fr.state_name as fr_state_name, de.state_name as de_state_name, zh.state_name as zh_state_name, es.state_name as es_state_name, ru.state_name as ru_state_name, pt.state_name as pt_state_name, en.province_name as en_province_name, fr.province_name as fr_province_name, de.province_name as de_province_name, zh.province_name as zh_province_name, es.province_name as es_province_name, ru.province_name as ru_province_name, pt.province_name as pt_province_name, en.city_name as en_city_name, fr.city_name as fr_city_name, de.city_name as de_city_name, zh.city_name as zh_city_name, es.city_name as es_city_name, ru.city_name as ru_city_name, pt.city_name as pt_city_name from en, fr, de, zh_cn, es, ru, pt_br where en.city_id = fr.city_id and fr.city_id = de.city_id and de.city_id = zh_cn.city_id and zh_cn.city_id = es.city_id and es.city_id = ru.city_id and ru.city_id = pt_br.city_id 
Sign up to request clarification or add additional context in comments.

3 Comments

Yes! Something like that! With some corrections hehehe :) But it's taking forever! The mysqld process is using 20-40% cpu, but it won't finish... The 7 csv that I'm using are just 42mb. So I don't know why it's taking so long... :(
@LuisTeijon You know why it's not working? There are 80k rows on each table (7x80k) but most columns (not from 'en' column) are empty, so could it take more than 2 hours? I've got a decent processor (i5-4670k).
Ok the query it's working if I limit the select to 10 or 1000. But there are a lot of rows... how can I optimice that query?
2

Ok Gabriel where do we stand now with optimization.

  1. What have you accomplished with data load

  2. table structures with indexes

  3. row counts

I will propose an answer here as I dragged you back here from a duplicate question you willingly deleted (thx)

8 Comments

I never studied never of this, so I'm not really sure what are those things like indexes, etc... Do I have to tell you something about those 3 points?
How about whatever you want to tell. I have clue where you are at right now
I'm totally lost, but if I had to say something... I don't know if it has something to do but I plan on using db storage alongside memcached so the data loaded into the DOM is faster, so the db will only be called the first time the data is requested. Anyways optimization is always an important factor plus I like things organized (who doesn't like it) and understandable. But I'm still not really sure what is the best structure for something like this; having localized data in the same table, geographical data (like countries, etc) can be, I don't see any problem.
Let's start with where you left off with the guys here. Did you get that UNION done?
Looks like Rhino has the indexes in place assuming his tables are like yours. Maybe you have done a UNION already. Add necessary indexes on the parts of joins and ordering
|
1

Try something like this:

 INSERT INTO new_table (continent_code, subdivision_1_name) SELECT en.continent_code , en.subdivision_1_name FROM en 

EDIT

 INSERT INTO new (cz_val,cz_value,en_val,en_value) SELECT cz.val, cz.value, en.val, en.value FROM cz INNER JOIN en ON en.id = cz.id GROUP BY cz.id 

7 Comments

Is it not possible to select them all?
No, becouse select must has same columns like columns in insert. But if you use some programming language, you can do script for it.
But there has to be a way to do it with mysql queries, the tables I'm trying to join and take data they all have the same structure. I need to join (literally) them into just one table, so I don't have 7 tables. The columns that have localized names I rename them adding the country code as prefix. Are you sure I can't do it with queries?
Ok, i am sorry, you can do it in one query. I look bad to you sql. You can join other tables ,but you dont forget when columns in insert must be same like columns in select. I thing number of columns. That is very important in this query.
Can you give me an example using two tables?
|
1

Try this:

INSERT INTO geo_lists SELECT * FROM en UNION SELECT * FROM fr UNION SELECT * FROM de UNION SELECT * FROM zh_cn UNION SELECT * FROM es UNION SELECT * FROM ru UNION SELECT * FROM pt_br UNION 

9 Comments

So I just replace the left join by the selects?
SELECTs are not replacing the LEFT JOINs... notice that I used UNION to join the tables... You can use UNION since all the tables has the same fields... This query has the format: INSERT INTO X (SELECT * FROM Y)
Ok, so I just add the values to your code or how can I specify what columns goes in the geo_lists table?
You can specify what fields you want to insert e.g. INSERT INTO geo_lists SELECT a,b FROM en UNION SELECT a,b FROM fr UNION etc...
@LuisTeijon I think he means the new table columns, I'll add the structure for the new column where all data will be. It seems I can't neither understand how to make it work... Added.
|
1

I also saw your duplicate question about query optimization. And I found the solution. The problem is that your tables don't have indexes. Just do:

ALTER TABLE en ADD PRIMARY KEY(geoname_id); ALTER TABLE fr ADD PRIMARY KEY(geoname_id); ALTER TABLE de ADD PRIMARY KEY(geoname_id); ALTER TABLE zh_cn ADD PRIMARY KEY(geoname_id); ALTER TABLE es ADD PRIMARY KEY(geoname_id); ALTER TABLE ru ADD PRIMARY KEY(geoname_id); ALTER TABLE pt_br ADD PRIMARY KEY(geoname_id); 

Then run:

DROP TABLE IF EXISTS geo_lists; CREATE TABLE `geo_lists` ( `city_id` int (11), -- en.geoname_id (same for all 7 tables) `continent_code` varchar (2), -- en.continent_code (same for all 7 tables) `continent_name` varchar (200), -- en.continent_name (just in english) `country_code` varchar (2), -- en.country_iso_code (same for all 7 tables) `en_country_name` varchar (200), -- en.country_name `fr_country_name` varchar (200), -- fr.country_name `de_country_name` varchar (200), -- de.country_name `zh_country_name` varchar (200), -- zh_cn.country_name `es_country_name` varchar (200), -- es.country_name `ru_country_name` varchar (200), -- ru.country_name `pt_country_name` varchar (200), -- pt_br.country_name `en_state_name` varchar (200), -- en.subdivision_1_name `fr_state_name` varchar (200), -- fr.subdivision_1_name `de_state_name` varchar (200), -- de.subdivision_1_name `zh_state_name` varchar (200), -- zh_cn.subdivision_1_name `es_state_name` varchar (200), -- es.subdivision_1_name `ru_state_name` varchar (200), -- ru.subdivision_1_name `pt_state_name` varchar (200), -- pt_br.subdivision_1_name `en_province_name` varchar (200), -- en.subdivision_2_name `fr_province_name` varchar (200), -- fr.subdivision_2_name `de_province_name` varchar (200), -- de.subdivision_2_name `zh_province_name` varchar (200), -- zh_cn.subdivision_2_name `es_province_name` varchar (200), -- es.subdivision_2_name `ru_province_name` varchar (200), -- ru.subdivision_2_name `pt_province_name` varchar (200), -- pt_br.subdivision_2_name `en_city_name` varchar (200), -- en.city_name `fr_city_name` varchar (200), -- fr.city_name `de_city_name` varchar (200), -- de.city_name `zh_city_name` varchar (200), -- zh_cn.city_name `es_city_name` varchar (200), -- es.city_name `ru_city_name` varchar (200), -- ru.city_name `pt_city_name` varchar (200), -- pt_br.city_name `time_zone` varchar (30) -- en.time_zone (same for all 7 tables) ); INSERT INTO geo_lists SELECT en.geoname_id, en.continent_code, en.continent_name, en.country_iso_code, en.country_name AS en_country_name, fr.country_name AS fr_country_name, de.country_name AS de_country_name, zh_cn.country_name AS zh_cn_country_name, es.country_name AS es_country_name, ru.country_name AS ru_country_name, pt_br.country_name AS pt_br_country_name, en.subdivision_1_name AS en_subdivision_1_name, fr.subdivision_1_name AS fr_subdivision_1_name, de.subdivision_1_name AS de_subdivision_1_name, zh_cn.subdivision_1_name AS zh_cn_subdivision_1_name, es.subdivision_1_name AS es_subdivision_1_name, ru.subdivision_1_name AS ru_subdivision_1_name, pt_br.subdivision_1_name AS pt_br_subdivision_1_name, en.subdivision_2_name AS en_subdivision_2_name, fr.subdivision_2_name AS fr_subdivision_2_name, de.subdivision_2_name AS de_subdivision_2_name, zh_cn.subdivision_2_name AS zh_cn_subdivision_2_name, es.subdivision_2_name AS es_subdivision_2_name, ru.subdivision_2_name AS ru_subdivision_2_name, pt_br.subdivision_2_name AS pt_br_subdivision_2_name, en.city_name AS en_city_name, fr.city_name AS fr_city_name, de.city_name AS de_city_name, zh_cn.city_name AS zh_cn_city_name, es.city_name AS es_city_name, ru.city_name AS ru_city_name, pt_br.city_name AS pt_br_city_name, en.time_zone FROM en INNER JOIN fr USING(geoname_id) INNER JOIN de USING(geoname_id) INNER JOIN zh_cn USING(geoname_id) INNER JOIN es USING(geoname_id) INNER JOIN ru USING(geoname_id) INNER JOIN pt_br USING(geoname_id); 

On my PC time of query is 8.3 sec. Good luck!

2 Comments

Lol: Execution/Total Time: 1.326 sec hehehe
Hmm I don't know what answer accept, Luis Teijon made the query... But thanks anyways for figuring out where the error was!