0 / 0
Collecting statistics for all other data sources in Watson Query

Collecting statistics for all other data sources

You can use the DVSYS.COLLECT_STATISTICS procedure to collect statistics for all virtualized objects except objects that are defined in object storage.

The following example shows you how to collect statistics with the remote-catalog collection type of the DVSYS.COLLECT_STATISTICS procedure.
call DVSYS.COLLECT_STATISTICS('schema',  'table_name', 'list_of_columns_commonly_used_in_predicates', 1, null, ?);

This call collects table-level statistics and column-level statistics for the columns that are included in the list_of_columns_commonly_used_in_predicates argument. If the table has few columns, you can collect statistics for all columns in the table by specifying the null value instead of a column list. To avoid collecting column statistics, specify the empty string ('') instead.

The remote-catalog collection type is not supported for all data sources, and some data sources that support the remote-catalog collection type do not support all the key statistics that are used by Watson Query. In either case, it might be necessary to use the remote-query type of statistics collection. The following example is similar to the previous example, but uses the remote-query collection type instead.

call DVSYS.COLLECT_STATISTICS('schema', 'table_name', 'list_of_columns_commonly_used_in_predicates', 2, null, ?);
This type of statistics collection can be resource-intensive and take a long time to complete. If the virtualized table has many rows or statistics are gathered for many columns, consider collecting statistics when the system is less busy. If the table has more than 1000 rows, you can use sampling (specify the TABLESAMPLE option) to reduce the resources that are used by statistics collection and to improve performance. The following example uses a data sampling rate of 25%:
call DVSYS.COLLECT_STATISTICS('schema', 'table_name', 'list_of_columns_commonly_used_in_predicates', 2, 'TABLESAMPLE=25', ?);
Avoid scheduling many statistics collection jobs at the same time.
Generative AI search and answer
These answers are generated by a large language model in watsonx.ai based on content from the product documentation. Learn more