title |
---|
Syncing and scanning databases |
Metabase periodically runs different types of queries to stay up to date with your database's metadata. Knowing stuff about your data helps Metabase do things like display the right chart for the results automatically, and populate dropdown menus in filter widgets.
There are a couple of things Metabase does:
When Metabase first connects to your database, Metabase performs a sync to determine the metadata of the columns in your tables and automatically assign each column a semantic type.
You can follow the progress of these queries from Admin > Troubleshooting > Logs.
Once the queries are done running, you can view and edit the synced metadata from Admin settings > Table Metadata. For more info, see editing metadata.
By default, Metabase does a lightweight hourly sync and an intensive daily scan of field values. If you have a large database, you might want to choose when syncs and scans happen:
If you've selected Choose when syncs and scans happen > ON, you'll be able to set:
Metabase can scan and cache the values present in a field so it can display things like checkbox filters in dashboards and questions. Metabase will only scan fields that people are actually using in your Metabase. So if people are using a filter widget on a dashboard, Metabase will scan and cache values from that field to include in dropdown menus. If people stop using the filter widget for a couple of weeks, Metabase will stop scanning and caching those values.
If you've selected Choose when syncs and scans happen > ON, you'll see the following options under Scanning for filter values:
To scan values from all the columns in a table:
To scan values from a specific column:
To clear the scanned field values for a table:
You can also tell Metabase to forget the cached values for individual fields by clicking the gear icon on a field and clicking on Discard cached field values.
To prevent Metabase from running syncs and scans against a specific table, change the table visibility to Hidden:
Hiding a table will also prevent it from showing up in the query builder and data reference. People can still query hidden tables from the SQL editor.
Metabase syncs and scans regularly, but if the database administrator has just changed the database schema, or if a lot of data is added automatically at specific times, you may want to write a script that uses the Metabase API to force a sync or scan. Our API provides two ways to initiate a sync or scan of a database:
/api/database/:id/sync_schema
or api/database/:id/rescan_values
endpoints. These endpoints do the same things as going to the database in the Admin Panel and choosing Sync database schema or Re-scan field values respectively. To use these endpoints, you have to authenticate with a user ID and pass a session token in the header of your request./api/notify/db/:id
. We created this endpoint so that people could notify their Metabase to sync after an ETL operation finishes. To use this endpoint, you must pass an API key by defining the MB_API_KEY
environment variable.A Metabase sync is a query that gets a list of updated table and view names, column names, and column data types from your database:
SELECT
TRUE
FROM
"your_schema"."your_table_or_view"
WHERE
1 <> 1
LIMIT 0
This query runs against your database during setup, and again every hour by default. This scanning query is fast with most relational databases, but can be slower with MongoDB and some community-built database drivers. Syncing can't be turned off completely, otherwise Metabase wouldn't work.
A Metabase scan is a query that caches the column values for filter dropdowns by looking at the first 1,000 distinct records from each table, in ascending order:
SELECT
"your_table_or_view"."column" AS "column"
FROM
"your_schema"."your_table_or_view"
GROUP BY
"your_table_or_view"."column"
ORDER BY
"your_table_or_view"."column" ASC
LIMIT 1000
For each record, Metabase only stores the first 100 kilobytes of text, so if you have data with 1,000 characters each (like addresses), and your column has more than 100 unique addresses, Metabase will only cache the first 100 values from the scan query.
Cached column values are displayed in filter dropdown menus. If people type in the filter search box for values that aren't in the first 1,000 distinct records or 100kB of text, Metabase will run a query against your database to look for those values on the fly.
A scan is more intensive than a sync query, so it only runs once during setup, and again once a day by default. If you disable scans entirely, you'll need to bring things up to date by running manual scans.
To reduce the number of tables and fields Metabase needs to scan in order to stay current with your connected database, Metabase will only scan values for fields that someone has queried in the last fourteen days.
Periodic refingerprinting will increase the load on your database.
By default, Metabase only runs fingerprinting queries when you first connect your database.
Turn this setting on if you want Metabase to use larger samples of column values when making suggestions in the UI:
The fingerprinting query looks at the first 10,000 rows from a given table or view in your database:
SELECT
*
FROM
"your_schema"."your_table_or_view"
LIMIT 10000
Metabase uses the results of this query to provide better suggestions in the Metabase UI (such as filter dropdowns and auto-binning). To avoid putting strain on your database, Metabase only runs fingerprinting queries the first time you set up a database connection. To change this default, you can turn ON Periodically refingerprint tables.
Metabase doesn't do any caching or rate limiting during the sync and scan process. If your data appears to be missing or out of date, check out:
Вы можете оставить комментарий после Вход в систему
Неприемлемый контент может быть отображен здесь и не будет показан на странице. Вы можете проверить и изменить его с помощью соответствующей функции редактирования.
Если вы подтверждаете, что содержание не содержит непристойной лексики/перенаправления на рекламу/насилия/вульгарной порнографии/нарушений/пиратства/ложного/незначительного или незаконного контента, связанного с национальными законами и предписаниями, вы можете нажать «Отправить» для подачи апелляции, и мы обработаем ее как можно скорее.
Опубликовать ( 0 )