|
6 | 6 | "source": [ |
7 | 7 | "# Google Cloud SQL for MySQL\n", |
8 | 8 | "\n", |
9 | | - "> [Cloud SQL](https://cloud.google.com/sql) is a fully managed relational database service that offers high performance, seamless integration, and impressive scalability. It offers [MySQL](https://cloud.google.com/sql/mysql), [PostgreSQL](https://cloud.google.com/sql/postgres), and [SQL Server](https://cloud.google.com/sql/sqlserver) database engines. Extend your database application to build AI-powered experiences leveraging Cloud SQL's Langchain integrations.\n", |
| 9 | + "> [Cloud SQL](https://cloud.google.com/sql) is a fully managed relational database service that offers high performance, seamless integration, and impressive scalability. It offers [MySQL](https://cloud.google.com/sql/mysql), [PostgreSQL](https://cloud.google.com/sql/postgresql), and [SQL Server](https://cloud.google.com/sql/sqlserver) database engines. Extend your database application to build AI-powered experiences leveraging Cloud SQL's Langchain integrations.\n", |
10 | 10 | "\n", |
11 | 11 | "This notebook goes over how to use [Cloud SQL for MySQL](https://cloud.google.com/sql/mysql) to [save, load and delete langchain documents](https://python.langchain.com/docs/modules/data_connection/document_loaders/) with `MySQLLoader` and `MySQLDocumentSaver`.\n", |
12 | 12 | "\n", |
| 13 | + "Learn more about the package on [GitHub](https://github.com/googleapis/langchain-google-cloud-sql-mysql-python/).\n", |
| 14 | + "\n", |
13 | 15 | "[](https://colab.research.google.com/github/googleapis/langchain-google-cloud-sql-mysql-python/blob/main/docs/document_loader.ipynb)" |
14 | 16 | ] |
15 | 17 | }, |
|
20 | 22 | "## Before You Begin\n", |
21 | 23 | "\n", |
22 | 24 | "To run this notebook, you will need to do the following:\n", |
| 25 | + "\n", |
23 | 26 | "* [Create a Google Cloud Project](https://developers.google.com/workspace/guides/create-project)\n", |
| 27 | + "* [Enable the Cloud SQL Admin API.](https://console.cloud.google.com/marketplace/product/google/sqladmin.googleapis.com)\n", |
24 | 28 | "* [Create a Cloud SQL for MySQL instance](https://cloud.google.com/sql/docs/mysql/create-instance)\n", |
25 | 29 | "* [Create a Cloud SQL database](https://cloud.google.com/sql/docs/mysql/create-manage-databases)\n", |
26 | 30 | "* [Add an IAM database user to the database](https://cloud.google.com/sql/docs/mysql/add-manage-iam-users#creating-a-database-user) (Optional)\n", |
|
136 | 140 | "auth.authenticate_user()" |
137 | 141 | ] |
138 | 142 | }, |
139 | | - { |
140 | | - "cell_type": "markdown", |
141 | | - "metadata": {}, |
142 | | - "source": [ |
143 | | - "### API Enablement\n", |
144 | | - "The `langchain-google-cloud-sql-mysql` package requires that you [enable the Cloud SQL Admin API](https://console.cloud.google.com/flows/enableapi?apiid=sqladmin.googleapis.com) in your Google Cloud Project." |
145 | | - ] |
146 | | - }, |
147 | | - { |
148 | | - "cell_type": "code", |
149 | | - "execution_count": null, |
150 | | - "metadata": {}, |
151 | | - "outputs": [], |
152 | | - "source": [ |
153 | | - "# enable Cloud SQL Admin API\n", |
154 | | - "!gcloud services enable sqladmin.googleapis.com" |
155 | | - ] |
156 | | - }, |
157 | 143 | { |
158 | 144 | "cell_type": "markdown", |
159 | 145 | "metadata": {}, |
|
179 | 165 | "By default, [IAM database authentication](https://cloud.google.com/sql/docs/mysql/iam-authentication#iam-db-auth) will be used as the method of database authentication. This library uses the IAM principal belonging to the [Application Default Credentials (ADC)](https://cloud.google.com/docs/authentication/application-default-credentials) sourced from the environment.\n", |
180 | 166 | "\n", |
181 | 167 | "For more information on IAM database authentication, please see:\n", |
| 168 | + "\n", |
182 | 169 | "* [Configure an instance for IAM database authentication](https://cloud.google.com/sql/docs/mysql/create-edit-iam-instances)\n", |
183 | 170 | "* [Manage users with IAM database authentication](https://cloud.google.com/sql/docs/mysql/add-manage-iam-users)\n", |
184 | 171 | "\n", |
185 | 172 | "Optionally, [built-in database authentication](https://cloud.google.com/sql/docs/mysql/built-in-authentication) using a username and password to access the Cloud SQL database can also be used. Just provide the optional `user` and `password` arguments to `MySQLEngine.from_instance()`:\n", |
| 173 | + "\n", |
186 | 174 | "* `user` : Database user to use for built-in database authentication and login\n", |
187 | 175 | "* `password` : Database password to use for built-in database authentication and login." |
188 | 176 | ] |
|
207 | 195 | "### Initialize a table\n", |
208 | 196 | "\n", |
209 | 197 | "Initialize a table of default schema via `MySQLEngine.init_document_table(<table_name>)`. Table Columns:\n", |
| 198 | + "\n", |
210 | 199 | "- page_content (type: text)\n", |
211 | 200 | "- langchain_metadata (type: JSON)\n", |
212 | 201 | "\n", |
|
229 | 218 | "### Save documents\n", |
230 | 219 | "\n", |
231 | 220 | "Save langchain documents with `MySQLDocumentSaver.add_documents(<documents>)`. To initialize the `MySQLDocumentSaver` class, you need to provide two things:\n", |
| 221 | + "\n", |
232 | 222 | "1. `engine` - An instance of a `MySQLEngine` engine.\n", |
233 | 223 | "2. `table_name` - The name of the table within the Cloud SQL database to store langchain documents." |
234 | 224 | ] |
|
241 | 231 | }, |
242 | 232 | "outputs": [], |
243 | 233 | "source": [ |
244 | | - "from langchain_google_cloud_sql_mysql import MySQLDocumentSaver\n", |
245 | 234 | "from langchain_core.documents import Document\n", |
| 235 | + "from langchain_google_cloud_sql_mysql import MySQLDocumentSaver\n", |
246 | 236 | "\n", |
247 | 237 | "test_docs = [\n", |
248 | 238 | " Document(\n", |
|
274 | 264 | "metadata": {}, |
275 | 265 | "source": [ |
276 | 266 | "Load langchain documents with `MySQLLoader.load()` or `MySQLLoader.lazy_load()`. `lazy_load` returns a generator that queries the database only during iteration. To initialize the `MySQLLoader` class, you need to provide:\n", |
| 267 | + "\n", |
277 | 268 | "1. `engine` - An instance of a `MySQLEngine` engine.\n", |
278 | 269 | "2. `table_name` - The name of the table within the Cloud SQL database to store langchain documents." |
279 | 270 | ] |
|
345 | 336 | "For a table with the default schema (page_content, langchain_metadata), the deletion criterion is:\n", |
346 | 337 | "\n", |
347 | 338 | "A `row` should be deleted if there exists a `document` in the list, such that\n", |
| 339 | + "\n", |
348 | 340 | "- `document.page_content` equals `row[page_content]`\n", |
349 | 341 | "- `document.metadata` equals `row[langchain_metadata]`" |
350 | 342 | ] |
|
402 | 394 | " CREATE TABLE IF NOT EXISTS `{TABLE_NAME}`(\n", |
403 | 395 | " fruit_id INT AUTO_INCREMENT PRIMARY KEY,\n", |
404 | 396 | " fruit_name VARCHAR(100) NOT NULL,\n", |
405 | | - " variety VARCHAR(50), \n", |
| 397 | + " variety VARCHAR(50),\n", |
406 | 398 | " quantity_in_stock INT NOT NULL,\n", |
407 | 399 | " price_per_unit DECIMAL(6,2) NOT NULL,\n", |
408 | 400 | " organic TINYINT(1) NOT NULL\n", |
|
449 | 441 | "metadata": {}, |
450 | 442 | "source": [ |
451 | 443 | "We can specify the content and metadata we want to load by setting the `content_columns` and `metadata_columns` when initializing the `MySQLLoader`.\n", |
| 444 | + "\n", |
452 | 445 | "1. `content_columns`: The columns to write into the `page_content` of the document.\n", |
453 | 446 | "2. `metadata_columns`: The columns to write into the `metadata` of the document.\n", |
454 | 447 | "\n", |
|
487 | 480 | "metadata": {}, |
488 | 481 | "source": [ |
489 | 482 | "To save langchain documents into a table with customized metadata fields, we first need to create such a table via `MySQLEngine.init_document_table()` and specify the list of `metadata_columns` we want it to have. In this example, the created table will have the following columns:\n", |
| 483 | + "\n", |
490 | 484 | "- description (type: text): for storing fruit description.\n", |
491 | 485 | "- fruit_name (type: text): for storing fruit name.\n", |
492 | 486 | "- organic (type: tinyint(1)): to tell if the fruit is organic.\n", |
493 | 487 | "- other_metadata (type: JSON): for storing other metadata information of the fruit.\n", |
494 | 488 | "\n", |
495 | 489 | "We can use the following parameters with `MySQLEngine.init_document_table()` to create the table:\n", |
| 490 | + "\n", |
496 | 491 | "1. `table_name`: The name of the table within the Cloud SQL database to store langchain documents.\n", |
497 | 492 | "2. `metadata_columns`: A list of `sqlalchemy.Column` indicating the list of metadata columns we need.\n", |
498 | 493 | "3. `content_column`: The name of column to store `page_content` of langchain document. Default: `page_content`.\n", |
|
532 | 527 | "metadata": {}, |
533 | 528 | "source": [ |
534 | 529 | "Save documents with `MySQLDocumentSaver.add_documents(<documents>)`. As you can see in this example, \n", |
| 530 | + "\n", |
535 | 531 | "- `document.page_content` will be saved into `description` column.\n", |
536 | 532 | "- `document.metadata.fruit_name` will be saved into `fruit_name` column.\n", |
537 | 533 | "- `document.metadata.organic` will be saved into `organic` column.\n", |
|
585 | 581 | "We can also delete documents from a table with customized metadata columns via `MySQLDocumentSaver.delete(<documents>)`. The deletion criterion is:\n", |
586 | 582 | "\n", |
587 | 583 | "A `row` should be deleted if there exists a `document` in the list, such that\n", |
| 584 | + "\n", |
588 | 585 | "- `document.page_content` equals `row[page_content]`\n", |
589 | 586 | "- For every metadata field `k` in `document.metadata`\n", |
590 | 587 | " - `document.metadata[k]` equals `row[k]` or `document.metadata[k]` equals `row[langchain_metadata][k]`\n", |
|