Adding data to Data Refinery

After you create a project and either create connections or add data assets to it, you can add data to Data Refinery and start preparing that data for analysis.

You can add data to Data Refinery in one of several ways:

  • Select Prepare data from the overflow menu of a data asset in the All assets list for the project
  • Preview a data asset in the project and then click Prepare data
  • Navigate to Data Refinery first and then add data to it

To navigate to Data Refinery first and then add data to it:

  1. From within a project, click the Assets tab.

  2. Click New asset > Data Refinery.

  3. Select the data that you want to work with from Data assets or from Connections.

    From Data assets:

    • Select a data file (the selection includes data files that were already shaped with Data Refinery)
    • Select a connected data asset

    From Connections:

    • Select a connection and file
    • Select a connection, folder, and file
    • Select a connection, schema, and table or view

    Data Refinery supports these file types:

    • Avro
    • CSV
    • Delimited text files
    • JSON
    • Microsoft Excel (xls and xlsx formats; first sheet only, except for connections and connected data assets)
    • Parquet
    • SAS with the "sas7bdat" extension (read only)
    • TSV (read only)

    Data Refinery operates on a sample subset of rows in the data set. The sample size is 1 MB or 10,000 rows, whichever comes first. However, when you run a job for the Data Refinery flow, the entire data set is processed. If the Data Refinery flow fails with a large data asset, see workarounds in Troubleshooting Data Refinery.

    Data connections marked with a key icon are locked. If you are authorized to access the data source, you are asked to enter your personal credentials the first time you select it. This one-time step permanently unlocks the connection for you. After you have unlocked the connection, the key icon is no longer displayed. See Adding connections to projects.

  4. Click Add to load the data into Data Refinery.
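
The sample limit described in step 3 (1 MB or 10,000 rows, whichever comes first) can be illustrated with a short Python sketch. This is not how Data Refinery is implemented. The function name take_sample is hypothetical, and the sketch assumes that the sample is taken from the start of the data set, that 1 MB means 1,000,000 bytes, and that row size is measured as UTF-8 encoded text.

```python
# Illustrative sketch only: approximates the "1 MB or 10,000 rows,
# whichever comes first" sample rule. All names here are hypothetical.

MAX_SAMPLE_ROWS = 10_000
MAX_SAMPLE_BYTES = 1_000_000  # assumption: 1 MB counted as 10^6 bytes


def take_sample(lines, max_rows=MAX_SAMPLE_ROWS, max_bytes=MAX_SAMPLE_BYTES):
    """Return the leading rows of `lines` until either limit is reached."""
    sample = []
    used_bytes = 0
    for line in lines:
        size = len(line.encode("utf-8"))
        # Stop at whichever limit is hit first: row count or byte budget.
        if len(sample) >= max_rows or used_bytes + size > max_bytes:
            break
        sample.append(line)
        used_bytes += size
    return sample


if __name__ == "__main__":
    narrow_rows = ["a,b\n"] * 20_000          # small rows: row limit applies
    wide_rows = ["x" * 999 + "\n"] * 5_000    # 1,000-byte rows: size limit applies
    print(len(take_sample(narrow_rows)))
    print(len(take_sample(wide_rows)))
```

With narrow rows the row limit is reached first, and with wide rows the size limit is reached first. In either case the sample only affects what you see while shaping the data, because the job for the Data Refinery flow processes the entire data set.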

Parent topic: Refining data
