Select the data that you want to work with from Data assets or from Connections.
From Data assets:
- Select a data file (the selection includes data files that were already shaped with Data Refinery)
- Select a connected data asset
From Connections:
- Select a connection and file
- Select a connection, folder, and file
- Select a connection, schema, and table or view
Data Refinery supports these file types:
- Avro
- CSV
- Delimited text files
- JSON
- Microsoft Excel (xls and xlsx formats; first sheet only, except for connections and connected data assets)
- Parquet
- SAS with the "sas7bdat" extension (read only)
- TSV (read only)
Data Refinery operates on a sample subset of rows in the data set. The sample size is 1 MB or 10,000 rows, whichever comes first. However, when you run a job for the Data Refinery flow, the entire data set is processed. If the Data Refinery
flow fails with a large data asset, see workarounds in Troubleshooting Data Refinery.
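The "1 MB or 10,000 rows, whichever comes first" rule can be illustrated with a minimal Python sketch. This is not Data Refinery's actual implementation. The function name take_sample, the use of UTF-8 byte length to measure size, the interpretation of 1 MB as 1,048,576 bytes, and the choice to stop before a row would exceed the limit are all assumptions made for illustration.

```python
from typing import Iterable, List

MAX_ROWS = 10_000
MAX_BYTES = 1024 * 1024  # assumption: 1 MB is taken as 1,048,576 bytes


def take_sample(
    rows: Iterable[str],
    max_rows: int = MAX_ROWS,
    max_bytes: int = MAX_BYTES,
) -> List[str]:
    """Collect rows until either the row limit or the byte limit is reached.

    Hypothetical helper: the size of a row is approximated by its UTF-8
    encoded length, and a row that would push the total past max_bytes
    is left out of the sample.
    """
    sample: List[str] = []
    used_bytes = 0
    for row in rows:
        row_size = len(row.encode("utf-8"))
        # Stop at whichever limit is hit first.
        if len(sample) >= max_rows or used_bytes + row_size > max_bytes:
            break
        sample.append(row)
        used_bytes += row_size
    return sample
```

With many short rows the row limit applies first; with long rows the size limit applies first. Either way, the sample only affects what you see while you shape the data. A job run still processes every row.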
Data connections marked with a key icon are locked. If you are authorized to access the data source, you are asked to enter your personal credentials the first time you select the connection. This one-time step permanently unlocks the connection for you. After you have unlocked the connection, the key icon is no longer displayed. See Adding connections to projects.