-
- Notifications
You must be signed in to change notification settings - Fork 19.4k
ENH: to_sql() add parameter "method" to control insertions method (#8… #21401
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 6 commits
a77cdfd 21e8c04 1e5d1cc 085313c 236e2c0 f4ffbfc b49792b 59cbdf7 7bdf6f3 44f321a d5ccabf 643e9bf 6fc6a26 87730f3 f36710b 19f9dfa c0bf457 19ce379 File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | @@ -4806,6 +4806,55 @@ default ``Text`` type for string columns: | |
| Because of this, reading the database table back in does **not** generate | ||
| a categorical. | ||
| | ||
| .. _io.sql.method: | ||
| | ||
| Insertion Method | ||
| ++++++++++++++++ | ||
| | ||
| .. versionadded:: 0.24.0 | ||
| | ||
| The parameter ``method`` controls the SQL insertion clause used. | ||
| Possible values are: | ||
| | ||
| - ``'default'``: Uses standard SQL ``INSERT`` clause (one per row). | ||
| - ``'multi'``: Pass multiple values in a single ``INSERT`` clause. | ||
| It uses a *special* SQL syntax not supported by all backends. | ||
| This usually provides better performance for analytic databases | ||
| like *Presto* and *Redshift*, but has worse performance for | ||
| traditional SQL backend if the table contains many columns. | ||
| For more information check the SQLAlchemy `documention | ||
| <http://docs.sqlalchemy.org/en/latest/core/dml.html#sqlalchemy.sql.expression.Insert.values.params.*args>`__. | ||
| - callable with signature ``(pd_table, conn, keys, data_iter)``: | ||
| This can be used to implement a more performant insertion method based on | ||
| specific backend dialect features. | ||
| | ||
| Example of a callable using PostgreSQL `COPY clause | ||
| <https://www.postgresql.org/docs/current/static/sql-copy.html>`__:: | ||
| | ||
| # Alternative to_sql() *method* for DBs that support COPY FROM | ||
| import csv | ||
| from io import StringIO | ||
| | ||
| def psql_insert_copy(table, conn, keys, data_iter): | ||
| # gets a DBAPI connection that can provide a cursor | ||
| dbapi_conn = conn.connection | ||
| with dbapi_conn.cursor() as cur: | ||
| s_buf = StringIO() | ||
| writer = csv.writer(s_buf) | ||
| writer.writerows(data_iter) | ||
| s_buf.seek(0) | ||
| Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there a reason you are not using Further, I think it would be good to actually run the code of this example, to make sure it works. We could either add it as a test case, or either actually run it in the documentation build (by applying it on a small example dataframe) Contributor Author There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My understanding is that Contributor There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree this needs testing. Contributor There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Are we fixing this before merging? I can push a commit if we want. Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @TomAugspurger fix as in use Contributor There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Started out as that, but this is kind of a can of worms to actually run this example. We would need to actually create a Postgres database. I'll see if something similar exists for sqlite. Contributor There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Actually nevermind, this is tested below and we don't actually have a DataFrame here we have a Contributor There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So +1 to merging as is. | ||
| | ||
| columns = ', '.join('"{}"'.format(k) for k in keys) | ||
| if table.schema: | ||
| table_name = '{}.{}'.format(table.schema, table.name) | ||
| else: | ||
| table_name = table.name | ||
| | ||
| sql = 'COPY {} ({}) FROM STDIN WITH CSV'.format( | ||
| table_name, columns) | ||
| Contributor There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I realise this is just an example, but isn't this piece vulnerable to SQL injection? the Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @dahlbaek no expert on this, but if you could provide the code that you would replace it with to make it more robust, that would certainly be welcome! | ||
| cur.copy_expert(sql=sql, file=s_buf) | ||
| | ||
| | ||
| Reading Tables | ||
| '''''''''''''' | ||
| | ||
| | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "one per row", is that correct?
Because as far as I understand, the default behaviour will be to do a single insert statement with multiple rows (a "executemany"), where the number of rows is determined by the chunksize (and which by default is all rows)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe that is why everybody is complaining that it is so slow 😁