from IPython.display import Markdown, display, HTML
import warnings
warnings.filterwarnings('ignore')
Dialog Skill Analysis for classic Watson Assistant (WA) is intended for chatbot designers, developers, and data scientists who would like to experiment with and improve their existing dialog skill design in the classic experience.
We assume familiarity with the Watson Assistant product as well as concepts involved in dialog skill design, such as intents, entities, and utterances.
Python 3.9 or greater is required. For dependency requirements, please refer to requirements.txt
https://github.com/watson-developer-cloud/assistant-dialog-skill-analysis
!pip install --index-url https://pypi.python.org/simple -U "pip"
!git clone https://github.com/watson-developer-cloud/assistant-skill-analysis.git
!pip install ./assistant-skill-analysis
Looking in indexes: https://pypi.python.org/simple
Successfully installed pip-23.3.2
Cloning into 'assistant-skill-analysis'... done.
Successfully built assistant-skill-analysis ibm-watson
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. (Version conflicts reported for autoai-libs, autoai-ts-libs, lale, numba, and tensorflow against the newly installed numpy 1.26.3 and scikit-learn 1.2.2.)
Successfully installed assistant-skill-analysis-2.0.1 ibm-watson-7.0.1 nltk-3.8.1 numpy-1.26.3 pandas-1.4.4 scikit-learn-1.2.2 spacy-2.3.9 (full dependency download log trimmed)
# Standard python libraries
import sys, os
import json
import importlib
from collections import Counter
# External python libraries
import pandas as pd
import numpy as np
import nltk
nltk.download('stopwords')
nltk.download('punkt')
import ibm_watson
# Internal python libraries
from assistant_skill_analysis.utils import skills_util, lang_utils
from assistant_skill_analysis.highlighting import highlighter
from assistant_skill_analysis.data_analysis import summary_generator
from assistant_skill_analysis.data_analysis import divergence_analyzer
from assistant_skill_analysis.data_analysis import similarity_analyzer
from assistant_skill_analysis.term_analysis import chi2_analyzer
from assistant_skill_analysis.term_analysis import keyword_analyzer
from assistant_skill_analysis.term_analysis import entity_analyzer
from assistant_skill_analysis.confidence_analysis import confidence_analyzer
from assistant_skill_analysis.inferencing import inferencer
from assistant_skill_analysis.experimentation import data_manipulator
[nltk_data] Downloading package stopwords to /home/wsuser/nltk_data... [nltk_data] Package stopwords is already up-to-date! [nltk_data] Downloading package punkt to /home/wsuser/nltk_data... [nltk_data] Package punkt is already up-to-date!
Please provide access credentials for an existing dialog skill that you would like to analyze.
Have your API key and workspace ID handy.
importlib.reload(skills_util)
# Change Assistant API version if needed
# Find Latest --> https://cloud.ibm.com/docs/services/assistant?topic=assistant-release-notes
API_VERSION = '2020-04-01'
# choose a datacenter to use
datacenters = {
'dallas': ('https://api.us-south.assistant.watson.cloud.ibm.com', 'https://iam.cloud.ibm.com/identity/token'),
'washington': ('https://api.us-east.assistant.watson.cloud.ibm.com', 'https://iam.cloud.ibm.com/identity/token'),
'frankfurt' : ('https://api.eu-de.assistant.watson.cloud.ibm.com', 'https://iam.cloud.ibm.com/identity/token'),
'sydney' : ('https://api.au-syd.assistant.watson.cloud.ibm.com', 'https://iam.cloud.ibm.com/identity/token'),
'tokyo' : ('https://api.jp-tok.assistant.watson.cloud.ibm.com', 'https://iam.cloud.ibm.com/identity/token'),
'london' : ('https://api.eu-gb.assistant.watson.cloud.ibm.com', 'https://iam.cloud.ibm.com/identity/token'),
}
URL, authenticator_url = datacenters['dallas']
# For ICP(IBM Cloud Private), you can disable SSL verification by changing this to True
DISABLE_SSL_VERTIFICATION = False
# By default we only need the IAM API key & the workspace ID
# If you run the notebook regularly, you can uncomment the two lines below
# & comment out the input_credentials() call that follows
#iam_apikey = '###'
#skill_id = '###'
iam_apikey, skill_id, _ = skills_util.input_credentials()
conversation = skills_util.retrieve_conversation(iam_apikey=iam_apikey,
url=URL,
api_version=API_VERSION,
authenticator_url=authenticator_url)
#If you do not have IAM based API Keys
#but have access to a Username, Password & Workspace ID
#You can comment out the two lines above & uncomment the lines below to authenticate
# username = 'apikey'
# password = '###'
# skill_id = '###'
# conversation = skills_util.retrieve_conversation(username=username,
# password=password,
# url=URL,
# api_version=API_VERSION)
conversation.set_disable_ssl_verification(DISABLE_SSL_VERTIFICATION)
workspace = skills_util.retrieve_workspace(skill_id=skill_id,
conversation=conversation)
Please enter apikey: ········ Please enter skill-id (workspace_id): ········
Pick the language code corresponding to your workspace data:
Supported Language codes: en, fr, de, es, cs, it, pt, nl
LANGUAGE_CODE="en" # change the language code to work with other languages
lang_util = lang_utils.LanguageUtility(LANGUAGE_CODE)
# Extract user workspace
workspace_pd, workspace_vocabulary, entities, _ = skills_util.extract_workspace_data(workspace, language_util=lang_util)
entities_list = [item['entity'] for item in entities]
display(Markdown("### Sample of Utterances & Intents"))
display(HTML(workspace_pd.sample(n = len(workspace_pd) if len(workspace_pd)<10 else 10)
.to_html(index=False)))
if entities_list:
display(Markdown("### Sample of Entities"))
display(HTML(pd.DataFrame({"Entity":entities_list})
.sample(n = len(entities_list) if len(entities_list)<10 else 10)
.to_html(index=False)))
| utterance | intent | tokens |
|---|---|---|
| i would like to speak to someone | General_Connect_to_Agent | [i, would, like, to, speak, to, someon] |
| send me to an agent | General_Connect_to_Agent | [send, me, to, an, agent] |
| are stores open on sunday | Customer_Care_Store_Hours | [are, store, open, on, sunday] |
| i d like to go to a store | Customer_Care_Store_Location | [i, d, like, to, go, to, a, store] |
| how are you today | General_Greetings | [how, are, you, today] |
| what time does the central manchester store shut on a saturday | Customer_Care_Store_Hours | [what, time, doe, the, central, manchest, store, shut, on, a, saturday] |
| want to change my visit | Customer_Care_Appointments | [want, to, chang, my, visit] |
| see ya | Goodbye | [see, ya] |
| what is your location | Customer_Care_Store_Location | [what, is, your, locat] |
| can i connect to an agent | General_Connect_to_Agent | [can, i, connect, to, an, agent] |
| Entity |
|---|
| phone |
| reply |
| sys-number |
| sys-date |
| zip_code |
| sys-time |
| holiday |
| specialist |
| landmark |
We generate summary statistics for the given skill and workspace.
summary_generator.generate_summary_statistics(workspace_pd, entities_list)
| | Data Characteristic | Value |
|---|---|---|
| 1 | Total User Examples | 199 |
| 2 | Unique Intents | 9 |
| 3 | Average User Examples per Intent | 22 |
| 4 | Standard Deviation from Average | 16 |
| 5 | Total Number of Entities | 9 |
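The statistics above can be reproduced with plain pandas. This is a minimal sketch, not the toolkit's implementation; the `intent` column name mirrors `workspace_pd`, and the toy data is illustrative:

```python
import pandas as pd

# Toy stand-in for workspace_pd: one row per user example
df = pd.DataFrame({
    "utterance": ["hi", "hello", "bye", "thanks", "thank you"],
    "intent": ["greet", "greet", "goodbye", "thanks", "thanks"],
})

counts = df["intent"].value_counts()
print("Total User Examples:", len(df))
print("Unique Intents:", counts.size)
print("Average User Examples per Intent:", round(counts.mean()))
print("Standard Deviation from Average:", round(counts.std(), 1))
```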
We analyze whether the dataset contains class imbalance by checking whether the largest intent contains less than double the number of user examples in the smallest intent. The presence of imbalance does not necessarily indicate an issue; please review the actions section below.
class_imb_flag = summary_generator.class_imbalance_analysis(workspace_pd)
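The check described above reduces to a one-line comparison of intent counts. This sketch assumes a DataFrame with an `intent` column, as in `workspace_pd`; the data is hypothetical:

```python
import pandas as pd

# Hypothetical skill data with one row per user example, as in workspace_pd
df = pd.DataFrame({"intent": ["Goodbye"] * 6 + ["Customer_Care_Store_Hours"] * 48})

counts = df["intent"].value_counts()
# Flag imbalance when the largest intent has more than double the
# number of user examples of the smallest intent
is_imbalanced = counts.max() > 2 * counts.min()
print(bool(is_imbalanced))
```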
Below we display the distribution of intents versus the number of examples per intent, sorted by the number of examples. Ideally, the number of user examples should not vary widely across intents.
summary_generator.scatter_plot_intent_dist(workspace_pd)
Sorted Distribution of User Examples per Intent
summary_generator.show_user_examples_per_intent(workspace_pd)
| | Intent | Number of User Examples |
|---|---|---|
| 1 | Goodbye | 6 |
| 2 | Cancel | 7 |
| 3 | Thanks | 8 |
| 4 | Help | 8 |
| 5 | Customer_Care_Appointments | 20 |
| 6 | Customer_Care_Store_Location | 25 |
| 7 | General_Greetings | 30 |
| 8 | General_Connect_to_Agent | 47 |
| 9 | Customer_Care_Store_Hours | 48 |
Class imbalance will not always lead to lower accuracy, so all intents (classes) need not have the same number of examples.

- For intents like `updateBankAccount` and `addNewAccountHolder`, where the semantic difference between them is subtler, the number of examples per intent needs to be somewhat balanced, or else the classifier might favor the intent with the higher number of examples.
- For an intent like `greetings` that is semantically distinct from other intents like `updateBankAccount`, it may be okay to have fewer examples and still be easy for the intent detector to classify.

If during testing it seems like intent classification accuracy is lower than expected, we advise you to re-examine this distribution analysis.
With regard to the sorted distribution of examples per intent: if the number of user examples varies a lot across different intents, it can be a potential source of bias for intent detection and can lead to lower accuracy. Large imbalances should generally be avoided; if your graph displays this characteristic, it might be a source of error.
For further guidance on adding more examples to help balance out your distribution, please refer to Intent-Example-Recommendation
We perform a chi-square significance test using count features to determine the terms that are most correlated with each intent in the dataset.

A *unigram* is a single word, while a *bigram* is two consecutive words from within the training data. For example, in the sentence "Thank you for your service", each word is a unigram, while terms like "Thank you" and "your service" are bigrams.
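Extracting unigrams and bigrams can be sketched in a few lines of plain Python (a toy illustration, independent of the toolkit's tokenizer):

```python
sentence = "Thank you for your service"
tokens = sentence.lower().split()

unigrams = tokens
# A bigram pairs each token with the token that follows it
bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
print(bigrams)
```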
If you see terms like `hi` and `hello` correlated with a `greeting` intent, that is reasonable. But if you see terms like `table` and `chair` correlated with the `greeting` intent, that is anomalous. A scan of the most correlated unigrams and bigrams for each intent can help you spot potential anomalies in your training data.
Note: We ignore the following common words from consideration: an, a, in, on, be, or, of, and, can, is, to, the, i
unigram_intent_dict, bigram_intent_dict = chi2_analyzer.get_chi2_analysis(workspace_pd, lang_util=lang_util)
| | Intent | Correlated Unigrams | Correlated Bigrams |
|---|---|---|---|
| 1 | Customer_Care_Store_Hours | store, close, hour, are, open | what time, what are, you close, store open, you open |
| 2 | General_Connect_to_Agent | pleas, want, talk, speak, agent | do not, speak human, want speak, connect me, want talk |
| 3 | General_Greetings | hi, been, hello, good, hey | hey you, have you, you been, hey there, how are |
| 4 | Customer_Care_Store_Location | give, direct, find, where, locat | do get, find store, get your, where are, how do |
| 5 | Customer_Care_Appointments | visit, meet, face, make, appoint | like discuss, d like, like make, face face, make appoint |
| 6 | Help | me, assist, decid, say, help | need assist, what do, what say, you help, help me |
| 7 | Thanks | mani, nice, much, appreci, thank | you veri, much appreci, mani thank, appreci it, thank you |
| 8 | Cancel | request, tabl, anymor, cancel, mind | forget it, cancel that, cancel request, tabl anymor, anymor anymor |
| 9 | Goodbye | see, arrivederci, ciao, ya, bye | good bye, see ya, so long |
If you identify unusual or anomalous correlated terms, such as numbers or names, which should not be correlated with an intent, please read the following:
A heatmap of terms lets us visualize which terms or words occur frequently within each intent. Rows are the terms and columns are the intents.
By default we show only the top 30 intents with the highest number of user examples. This number can be changed if needed.
INTENTS_TO_DISPLAY = 30 # Total number of intents for display
MAX_TERMS_DISPLAY = 30 # Total number of terms to display
intent_list = []
keyword_analyzer.seaborn_heatmap(workspace_pd, lang_util, INTENTS_TO_DISPLAY, MAX_TERMS_DISPLAY, intent_list)
Token Frequency per Intent
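The frequency matrix behind a heatmap like this can be built with `pandas.crosstab` over exploded token lists. This is a minimal sketch with toy data; passing the resulting matrix to `seaborn.heatmap` would render a plot like the one above:

```python
import pandas as pd

# Toy tokenized workspace: one row per user example
df = pd.DataFrame({
    "intent": ["greet", "greet", "hours", "hours"],
    "tokens": [["hi", "there"], ["hello"], ["store", "hours"], ["store", "open"]],
})

# One row per (intent, token) pair, then a token-by-intent frequency matrix
pairs = df.explode("tokens")
matrix = pd.crosstab(pairs["tokens"], pairs["intent"])
print(matrix)
```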
If you wish to see term analysis for specific intents, add those intents to the intent list; this will generate a custom term heatmap. By default we show the top 30 tokens, but this can be changed if needed.
# intent_list = ['intent1','intent2','intent3']
intent_list = ['Customer_Care_Appointments']
MAX_TERMS_DISPLAY = 20 # Total number of terms to display
if intent_list:
keyword_analyzer.seaborn_heatmap(workspace_pd, lang_util, INTENTS_TO_DISPLAY, MAX_TERMS_DISPLAY, intent_list)
Token Frequency per Intent
If you notice any terms or words that should not be frequently present within an intent, consider modifying the examples in that intent.
Based on the chi-square analysis above, we generate intent pairs whose correlated unigrams and bigrams overlap. This allows us to get a glimpse of which unigrams or bigrams might cause potential confusion in intent detection.
ambiguous_unigram_df = chi2_analyzer.get_confusing_key_terms(unigram_intent_dict)
There is no ambiguity based on top 5 key terms in chi2 analysis
ambiguous_bigram_df = chi2_analyzer.get_confusing_key_terms(bigram_intent_dict)
There is no ambiguity based on top 5 key terms in chi2 analysis
# Add specific intent or intent pairs for which you would like to see overlap
intent1 = 'General_Connect_to_Agent'
intent2 = 'General_Greetings'
chi2_analyzer.chi2_overlap_check(ambiguous_unigram_df,ambiguous_bigram_df,intent1,intent2)
The following analysis shows user examples that are similar but fall under different intents.
similar_utterance_diff_intent_pd = similarity_analyzer.ambiguous_examples_analysis(workspace_pd, lang_util)
Ambiguous Intent Pairs
If you see terms that are correlated with more than one intent, please review whether this seems anomalous based on the use case for each intent. If it seems reasonable, it may not be an issue.
Ambiguous Utterances across intents
Reference for more information on entity: Entity Documentation
For more in-depth analysis related to possible conflicts in your training data across intents, try the conflict detection feature in Watson Assistant:
Conflict Resolution Documentation
Analyze your existing Watson Assistant Dialog Skill with the help of a test set.
Please upload a test set in CSV format. Each line in the file should contain only User_Input<tab>Intent
An example would be
hello how are you<tab>Greeting
I would like to talk to a human<tab>AgentHandoff
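The expected file format can be sketched with a few lines of pandas. This is an illustration only (with `sample_df` as a made-up name); the notebook itself reads the file from Cloud Object Storage via `skills_util.process_test_set` below.

```python
import pandas as pd
from io import StringIO

# Minimal sketch of the expected test-set format:
# one "User_Input<TAB>Intent" pair per line, no header row.
sample = ("hello how are you\tGreeting\n"
          "I would like to talk to a human\tAgentHandoff\n")
sample_df = pd.read_csv(StringIO(sample), sep="\t", header=None,
                        names=["utterance", "intent"])
print(sample_df.shape)  # (2, 2)
```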
import types
from botocore.client import Config
import ibm_boto3
# The following code accesses a csv file in your IBM Cloud Object Storage.
ENDPOINT_URL = 'https://s3.us-east.cloud-object-storage.appdomain.cloud' # change this based on the region of your cos bucket
# please fill in the details here:
COS_API_KEY_ID = 'YOUR_COS_API_KEY'
RESOURCE_INSTANCE_ID = 'YOUR_COS_RESOURCE_INSTANCE_ID'
IBM_COS_BUCKET = 'YOUR_COS_BUCKET_NAME'
IBM_COS_FILE_KEY = 'YOUR_COS_FILE_NAME'
cos_client = ibm_boto3.client(service_name='s3',
ibm_api_key_id=COS_API_KEY_ID,
ibm_service_instance_id = RESOURCE_INSTANCE_ID,
config=Config(signature_version='oauth'),
endpoint_url=ENDPOINT_URL)
body = cos_client.get_object(Bucket=IBM_COS_BUCKET,Key=IBM_COS_FILE_KEY)['Body']
separator = "\t" # separator used in csv.
test_df = skills_util.process_test_set(body, lang_util, delim=separator, cos=True)
display(Markdown("### Random Test Sample"))
display(HTML(test_df.sample(n=min(10, len(test_df))).to_html(index=False)))
utterance | intent | tokens |
---|---|---|
hi advisor | General_Greetings | [hi, advisor] |
can i connect to an agent | General_Connect_to_Agent | [can, i, connect, to, an, agent] |
thank you | Thanks | [thank, you] |
can you arrange for me to meet at your closest store | Customer_Care_Appointments | [can, you, arrang, for, me, to, meet, at, your, closest, store] |
please connect me to a live agent | General_Connect_to_Agent | [pleas, connect, me, to, a, live, agent] |
hey there | General_Greetings | [hey, there] |
are you open on sundays and if so what are the hours | Customer_Care_Store_Hours | [are, you, open, on, sunday, and, if, so, what, are, the, hour] |
i want to know about a store | Customer_Care_Store_Location | [i, want, to, know, about, a, store] |
i want to speak to a human | General_Connect_to_Agent | [i, want, to, speak, to, a, human] |
hey twin | General_Greetings | [hey, twin] |
These steps can take time if you have a large test set
Note: You will be charged for calls made from this notebook based on your WA plan
THREAD_NUM = min(4, os.cpu_count() if os.cpu_count() else 1)
# increase timeout if you experience `TimeoutError`.
# Increasing the `TIMEOUT` allows the process more breathing room to complete
TIMEOUT = 1 # `TIMEOUT` is set to 1 second
full_results = inferencer.inference(conversation=conversation,
test_data=test_df,
max_thread=THREAD_NUM,
skill_id=skill_id,
timeout=TIMEOUT
)
100%|██████████| 53/53 [00:03<00:00, 16.62it/s]
summary_generator.generate_summary_statistics(test_df)
summary_generator.show_user_examples_per_intent(test_df)
Data Characteristic | Value | |
---|---|---|
1 | Total User Examples | 53 |
2 | Unique Intents | 10 |
3 | Average User Examples per Intent | 5 |
4 | Standard Deviation from Average | 3 |
5 | Total Number of Entities | 0 |
Intent | Number of User Examples | |
---|---|---|
1 | Help | 2 |
2 | Thanks | 2 |
3 | Cancel | 3 |
4 | Goodbye | 3 |
5 | Customer_Care_Store_Location | 5 |
6 | Customer_Care_Appointments | 5 |
7 | SYSTEM_OUT_OF_DOMAIN | 7 |
8 | General_Greetings | 7 |
9 | Customer_Care_Store_Hours | 9 |
10 | General_Connect_to_Agent | 10 |
Ideally the Test and Training Data distributions should be similar. The following metrics can help identify gaps between Test Set and Training Set:
1. The distribution of User Examples per Intent for the Test Data should be comparable to the Training Data
2. Average length of User Examples for Test and Training Data should be comparable
3. The vocabulary and phrasing of utterances in the Test Data should be comparable to the Training Data
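Point 3 can be checked with a quick vocabulary-overlap calculation. The sketch below uses made-up token sets for illustration; it is not the notebook's internal code.

```python
# Illustrative: percentage of unique test-set tokens unseen in training.
train_vocab = {"hello", "agent", "store", "hours", "open"}
test_tokens = ["hello", "there", "agent", "please"]

unseen = {tok for tok in set(test_tokens) if tok not in train_vocab}
pct_unseen = len(unseen) / len(set(test_tokens)) * 100
print(pct_unseen)  # 50.0
```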
If your test data comprises examples labeled from your logs, and the training data comprises examples created by human subject matter experts, there may be discrepancies between what the virtual assistant designers thought the end users would type and the way they actually type in production. Thus, if you find discrepancies in this section, you might want to consider changing your design to more closely resemble the way end users use your system.
Note: You will be charged for calls made from this notebook based on your WA plan
divergence_analyzer.analyze_train_test_diff(workspace_pd, test_df, full_results)
Intent | % of Train | % of Test | Absolute Difference % | Train Examples | Test Examples | Test Precision % | Test Recall % | Test F1 % | |
---|---|---|---|---|---|---|---|---|---|
0 | Customer_Care_Store_Hours | 24.120000 | 16.980000 | 7.140000 | 48 | 9 | 100.000000 | 100.000000 | 100.000000 |
1 | General_Connect_to_Agent | 23.620000 | 18.870000 | 4.750000 | 47 | 10 | 90.910000 | 100.000000 | 95.240000 |
3 | Customer_Care_Store_Location | 12.560000 | 9.430000 | 3.130000 | 25 | 5 | 62.500000 | 100.000000 | 76.920000 |
8 | Goodbye | 3.020000 | 5.660000 | 2.650000 | 6 | 3 | 100.000000 | 100.000000 | 100.000000 |
7 | Cancel | 3.520000 | 5.660000 | 2.140000 | 7 | 3 | 100.000000 | 100.000000 | 100.000000 |
2 | General_Greetings | 15.080000 | 13.210000 | 1.870000 | 30 | 7 | 87.500000 | 100.000000 | 93.330000 |
4 | Customer_Care_Appointments | 10.050000 | 9.430000 | 0.620000 | 20 | 5 | 100.000000 | 100.000000 | 100.000000 |
5 | Help | 4.020000 | 3.770000 | 0.250000 | 8 | 2 | 50.000000 | 100.000000 | 66.670000 |
6 | Thanks | 4.020000 | 3.770000 | 0.250000 | 8 | 2 | 100.000000 | 100.000000 | 100.000000 |
Distribution Mismatch Color Code
Red - Severe
Blue - Caution
Green - Good
Note: the metric used is Jensen-Shannon distance
Average length of user examples is comparable
Train Vocabulary Size | Test Vocabulary Size | % Test Set Vocabulary not found in Train | |
---|---|---|---|
1 | 217 | 125 | 16.0 |
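The Jensen-Shannon distance behind the color coding above can be computed directly. This is our own minimal NumPy illustration, not the `divergence_analyzer` implementation:

```python
import numpy as np

def js_distance(p, q):
    """Jensen-Shannon distance (base-2) between two count vectors."""
    p = np.asarray(p, dtype=float) / np.sum(p)
    q = np.asarray(q, dtype=float) / np.sum(q)
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # 0 * log(0) is taken as 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return np.sqrt(0.5 * kl(p, m) + 0.5 * kl(q, m))

print(js_distance([1, 1], [1, 1]))  # 0.0 (identical distributions)
print(js_distance([1, 0], [0, 1]))  # 1.0 (disjoint distributions)
```

Identical train/test intent distributions yield a distance of 0; completely disjoint ones yield 1.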
results = full_results[['correct_intent', 'top_confidence','top_intent','utterance']]
accuracy = inferencer.calculate_accuracy(results)
display(Markdown("### Accuracy on Test Data: {} %".format(accuracy)))
This section gives the user an overview of the errors made by the intent classifier on the test set
Note: SYSTEM_OUT_OF_DOMAIN labels are assigned to user examples classified with confidence scores below 0.2, as Watson Assistant would consider them to be irrelevant
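As a hedged sketch of that relabeling rule (the label name matches the `SYSTEM_OUT_OF_DOMAIN` label used elsewhere in this notebook; the frame below is toy data, not the real results):

```python
import pandas as pd

OFFTOPIC_LABEL = "SYSTEM_OUT_OF_DOMAIN"

# Toy predictions: anything under the 0.2 confidence floor is
# treated as out of domain rather than kept as its top intent.
preds = pd.DataFrame({"top_intent": ["Help", "Thanks", "Goodbye"],
                      "top_confidence": [0.15, 0.90, 0.05]})
preds.loc[preds["top_confidence"] < 0.2, "top_intent"] = OFFTOPIC_LABEL
print(preds["top_intent"].tolist())
# ['SYSTEM_OUT_OF_DOMAIN', 'Thanks', 'SYSTEM_OUT_OF_DOMAIN']
```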
wrongs_df = inferencer.calculate_mistakes(results)
display(Markdown("### Intent Detection Mistakes"))
display(Markdown("Number of Test Errors: {}".format(len(wrongs_df))))
with pd.option_context('max_colwidth', 250):
if not wrongs_df.empty:
display(wrongs_df)
Number of Test Errors: 7
correct_intent | top_confidence | top_intent | utterance | |
---|---|---|---|---|
Test Example Index | ||||
46 | SYSTEM_OUT_OF_DOMAIN | 0.079831 | Help | can you tell me a good joke |
47 | SYSTEM_OUT_OF_DOMAIN | 0.140559 | Customer_Care_Store_Location | what is your iq |
48 | SYSTEM_OUT_OF_DOMAIN | 0.145882 | General_Greetings | luke i am your father |
49 | SYSTEM_OUT_OF_DOMAIN | 0.120711 | Customer_Care_Store_Location | where did betty buy her butter |
50 | SYSTEM_OUT_OF_DOMAIN | 0.007102 | General_Connect_to_Agent | how many engineers does it take to change a lightbulb |
51 | SYSTEM_OUT_OF_DOMAIN | 0.039089 | Help | can you help me change my account password |
52 | SYSTEM_OUT_OF_DOMAIN | 0.106361 | Customer_Care_Store_Location | what is a way to change my account address |
In this phase of the analysis, we illustrate how a confidence threshold, which determines what is considered irrelevant or out of domain, can be used for analysis
analysis_df = confidence_analyzer.analysis(results, None)
We calculate metrics for responses where the top intent has a confidence above the threshold specified on the x-axis.
We consider examples within the scope of the chatbot's problem formulation to be on topic (in domain), and examples outside that scope to be out of domain (irrelevant)
x-axis: Confidence threshold used || y-axis: Intent Detection Accuracy for On Topic utterances
x-axis: Confidence threshold used || y-axis: Fraction of All utterances above the threshold
x-axis: Confidence threshold used || y-axis: Fraction of Out of Domain utterances falsely considered on topic
If a certain confidence threshold T is selected, then:
analysis_df.index = np.arange(1, len(analysis_df)+1)
display(analysis_df)
Threshold (T) | Ontopic Accuracy (TOA) | Bot Coverage % | Bot Coverage Counts | False Acceptance Rate (FAR) | |
---|---|---|---|---|---|
1 | 0.0 | 100.0 | 100.000000 | 53 / 53 | 100.000000 |
2 | 0.1 | 100.0 | 94.339623 | 50 / 53 | 57.142857 |
3 | 0.2 | 100.0 | 88.679245 | 47 / 53 | 14.285714 |
4 | 0.3 | 100.0 | 86.792453 | 46 / 53 | 0.000000 |
5 | 0.4 | 100.0 | 86.792453 | 46 / 53 | 0.000000 |
6 | 0.5 | 100.0 | 86.792453 | 46 / 53 | 0.000000 |
7 | 0.6 | 100.0 | 84.905660 | 45 / 53 | 0.000000 |
8 | 0.7 | 100.0 | 84.905660 | 45 / 53 | 0.000000 |
9 | 0.8 | 100.0 | 83.018868 | 44 / 53 | 0.000000 |
10 | 0.9 | 100.0 | 81.132075 | 43 / 53 | 0.000000 |
By selecting a higher threshold, we can potentially bias our system towards greater accuracy in determining whether an utterance is on topic or out of domain. The default confidence threshold for Watson Assistant is 0.2
Effect on Accuracy: When we select a higher threshold T, this can result in higher accuracy (TOA) because only examples with confidences greater than the threshold T are included.
Effect on Bot Coverage %: However, when we select a higher threshold T, this can also result in fewer examples being responded to by the virtual assistant.
Deflection to Human Agent: In scenarios where the virtual assistant is set up to hand off to a human agent when it is less confident, having a higher threshold T can:
This section allows the examination of thresholds on specific intents.
False Acceptance Rate (FAR) for specific intents
When we calculate FAR across all intents (as in the previous section), we calculate the fraction of out-of-domain examples falsely considered on topic. When we calculate FAR for specific intents, we calculate the fraction of examples falsely predicted to be that specific intent.
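A per-intent FAR can be sketched as follows. The column names (`correct_intent`, `top_intent`, `top_confidence`) follow the `results` frame used earlier, but the helper itself and the `demo` frame are illustrative, not the analyzer's actual code:

```python
import pandas as pd

def intent_far(results, intent, threshold):
    """Fraction of examples whose true label is NOT `intent` but which
    the classifier predicted as `intent` with confidence >= `threshold`."""
    negatives = results[results["correct_intent"] != intent]
    if negatives.empty:
        return 0.0
    accepted = negatives[(negatives["top_intent"] == intent) &
                         (negatives["top_confidence"] >= threshold)]
    return len(accepted) / len(negatives)

demo = pd.DataFrame({
    "correct_intent": ["A", "B", "B", "B"],
    "top_intent":     ["A", "A", "B", "A"],
    "top_confidence": [0.9, 0.8, 0.9, 0.1],
})
print(intent_far(demo, "A", 0.5))  # 1 of 3 non-A examples accepted as A
```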
# Calculate intent with most test examples
for label in list(test_df['intent'].value_counts().index):
if label != skills_util.OFFTOPIC_LABEL:
MOST_FREQUENT_INTENT = label
break
# Specify intents of interest for analysis
INTENT_LIST = [MOST_FREQUENT_INTENT]
analysis_df_list = confidence_analyzer.analysis(results, INTENT_LIST)
Out of Domain examples fewer than 5 thus no False Acceptance Rate (FAR) calculated
Threshold (T) | Ontopic Accuracy (TOA) | Bot Coverage % | Bot Coverage Counts | |
---|---|---|---|---|
1 | 0.0 | 100.0 | 100.000000 | 11 / 11 |
2 | 0.1 | 100.0 | 100.000000 | 11 / 11 |
3 | 0.2 | 100.0 | 100.000000 | 11 / 11 |
4 | 0.3 | 100.0 | 100.000000 | 11 / 11 |
5 | 0.4 | 100.0 | 100.000000 | 11 / 11 |
6 | 0.5 | 100.0 | 90.909091 | 10 / 11 |
7 | 0.6 | 100.0 | 90.909091 | 10 / 11 |
8 | 0.7 | 100.0 | 90.909091 | 10 / 11 |
9 | 0.8 | 100.0 | 90.909091 | 10 / 11 |
10 | 0.9 | 100.0 | 90.909091 | 10 / 11 |
This intent can be the ground-truth intent or an incorrectly predicted intent. The analysis provides term-level insight into which terms the classifier thought were important in relation to that specific intent.
Even if the system predicts an intent correctly, the terms the intent classifier thought were important may not match human insight. Human insight might suggest that the classifier is focusing on the wrong terms.
The score of each term in the following highlighted images can be viewed as the importance of that term for that specific intent. The larger the score, the more important the term.
We can get the highlighted images for either wrongly-predicted utterances or utterances where the classifier returned a low confidence.
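One common way to estimate such term importances is leave-one-token-out scoring. This is a hypothetical sketch of the idea, not the highlighter's actual method: the notebook's highlighter queries Watson Assistant itself, while `toy_score` below is a stand-in scorer so the example runs locally.

```python
def token_importance(tokens, score_fn):
    """Importance of each token = drop in score when that token is removed."""
    base = score_fn(" ".join(tokens))
    return {
        tok: base - score_fn(" ".join(t for j, t in enumerate(tokens) if j != i))
        for i, tok in enumerate(tokens)
    }

def toy_score(text):
    # Pretend confidence: contribution of the keyword "agent" to the score.
    return text.split().count("agent") / 5

scores = token_importance(["connect", "me", "to", "an", "agent"], toy_score)
print(max(scores, key=scores.get))  # 'agent' has the largest score drop
```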
Note: You will be charged for calls made from this notebook based on your WA plan
# Pick an example from section 1 which was misclassified
# Add the example and correct intent for the example
utterance = "where is the closest agent" # input example
intent = "General_Connect_to_Agent" # input an intent in your workspace which you are interested in.
# increase timeout if you experience `TimeoutError`.
# Increasing the `TIMEOUT` allows the process more breathing room to complete
TIMEOUT = 1 # `TIMEOUT` is set to 1 second
inference_results = inferencer.inference(conversation=conversation,
skill_id=skill_id,
test_data=pd.DataFrame({'utterance':[utterance],
'intent':[intent]}),
max_thread = 1,
timeout=TIMEOUT
)
highlighter.get_highlights_in_batch_multi_thread(conversation=conversation,
full_results=inference_results,
output_folder=None,
confidence_threshold=1,
show_worst_k=1,
lang_util=lang_util,
skill_id=skill_id,
)
100%|██████████| 5/5 [00:00<00:00, 13.99it/s]
1 examples are shown below:
Characteristic | Value | |
---|---|---|
1 | Test Set Index | 0 |
2 | Utterance | where is the closest agent |
3 | Actual Intent | General_Connect_to_Agent |
4 | Predicted Intent | General_Connect_to_Agent |
5 | Confidence | 1 |
In the section below we analyze your test results and produce highlighting for the top 25 problematic utterances which were either mistakes or had confidences below the threshold that was set.
Note: You will be charged for calls made from this notebook based on your WA plan
# The output folder for generated images
# Note modify this if you want the generated images to be stored in a different directory
highlighting_output_folder = './highlighting_images/'
if not os.path.exists(highlighting_output_folder):
os.mkdir(highlighting_output_folder)
# Predictions with confidence below this threshold
# will be considered `out of domain` or `offtopic` utterances.
threshold = 0.2
# Maximum number of test set examples whose highlighting analysis will be conducted
K=25
highlighter.get_highlights_in_batch_multi_thread(conversation=conversation,
full_results=full_results,
output_folder=highlighting_output_folder,
confidence_threshold=threshold,
show_worst_k=K,
lang_util=lang_util,
skill_id=skill_id,
)
Every test utterance is classified as a specific intent with a specific confidence by the WA intent classifier. We expect the model to be confident when it predicts correctly and not highly confident when it predicts incorrectly.
Often this is not the case, which may suggest anomalies in the design. Examples that are predicted correctly with low confidence, and examples that are predicted incorrectly with high confidence, are therefore the cases that need review.
importlib.reload(confidence_analyzer)
correct_thresh, wrong_thresh = 0.3, 0.7
correct_with_low_conf_list, incorrect_with_high_conf_list = confidence_analyzer.abnormal_conf(
full_results, correct_thresh, wrong_thresh)
if len(correct_with_low_conf_list) > 0:
display(Markdown("#### Examples correctly predicted with low confidence"))
with pd.option_context('max_colwidth', 250):
display(HTML(correct_with_low_conf_list.to_html(index=False)))
if len(incorrect_with_high_conf_list) > 0:
display(Markdown("#### Examples incorrectly predicted with high confidence"))
with pd.option_context('max_colwidth', 250):
display(HTML(incorrect_with_high_conf_list.to_html(index=False)))
If there are examples which are getting classified incorrectly with high confidence for specific intents, it may indicate an issue in the design of those specific intents as the user examples provided for that intent may be overlapping with the design of other intents.
If intent A seems to always get misclassified as intent B with high confidence or gets correctly predicted with low confidence, please consider using intent conflict detection https://cloud.ibm.com/docs/services/assistant?topic=assistant-intents#intents-resolve-conflicts
Also consider whether those two intents need to be two separate intents or whether they need to be merged. If they can't be merged, then consider adding more user examples which distinguish intent A specifically from intent B.
We perform a chi-square significance test for entities, as we did for unigrams and bigrams in the previous section. This analysis calls the message API for entity detection on each utterance in the training data and finds the most correlated entities for each intent
Note: You will be charged for calls made from this notebook based on your WA plan.
if entities_list:
THREAD_NUM = min(4, os.cpu_count() if os.cpu_count() else 1)
# increase timeout if you experience `TimeoutError`.
# Increasing the `TIMEOUT` allows the process more breathing room to complete
TIMEOUT = 1 # `TIMEOUT` is set to 1 second
train_full_results = inferencer.inference(conversation=conversation,
test_data=workspace_pd,
max_thread=THREAD_NUM,
skill_id=skill_id,
timeout=TIMEOUT
)
entity_label_correlation_df = entity_analyzer.entity_label_correlation_analysis(
train_full_results, entities_list)
with pd.option_context('display.max_colwidth', 200):
entity_label_correlation_df.index = np.arange(1, len(entity_label_correlation_df) + 1)
display(entity_label_correlation_df)
else:
display(Markdown("### Target workspace has no entities."))
100%|██████████| 199/199 [00:13<00:00, 15.23it/s]
Intent | Correlated Entities | |
---|---|---|
1 | Customer_Care_Store_Location | landmark |
2 | General_Connect_to_Agent | sys-date, reply |
3 | Customer_Care_Store_Hours | sys-date, holiday, reply |
4 | Customer_Care_Appointments | sys-number |
Congratulations! You have successfully completed the Dialog Skill Analysis. This notebook is designed to support iterative improvement: you can tackle one aspect of your dialog skill at a time and return for another aspect later as part of continuous improvement.
True Positives (TP): True Positives measure the number of correctly predicted positive values, meaning the predicted class is the same as the actual class, which is the target intent.
True Negatives (TN): True Negatives measure the number of correctly predicted negative values, meaning the predicted class is the same as the actual class, which is not the target intent.
False Positives (FP): False Positives measure the number of incorrectly predicted positive values, meaning the predicted class is the target intent but the actual class is not.
False Negatives (FN): False Negatives measure the number of incorrectly predicted negative values, meaning the predicted class is not the target intent but the actual class is.
Accuracy: Accuracy measures the ratio of correctly predicted user examples out of all user examples.
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision: Precision measures the ratio of correctly predicted positive observations out of total predicted positive observations.
Precision = TP / (TP + FP)
Recall: Recall measures the ratio of correctly predicted positive observations out of all observations of the target intent.
Recall = TP / (TP + FN)
F1 Score: F1 Score is the harmonic average of Precision and Recall.
F1 = 2 * (Precision * Recall)/ (Precision + Recall)
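The formulas above can be illustrated with a small runnable example; `intent_metrics` and the toy labels below are our own, not part of the notebook's library:

```python
# Compute TP/TN/FP/FN-based metrics for one target intent
# from parallel lists of gold and predicted labels.
def intent_metrics(gold, pred, target):
    tp = sum(g == target and p == target for g, p in zip(gold, pred))
    fp = sum(g != target and p == target for g, p in zip(gold, pred))
    fn = sum(g == target and p != target for g, p in zip(gold, pred))
    tn = len(gold) - tp - fp - fn
    accuracy = (tp + tn) / len(gold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1

gold = ["A", "A", "B", "B"]
pred = ["A", "B", "B", "B"]
print(intent_metrics(gold, pred, "A"))
# accuracy 0.75, precision 1.0, recall 0.5, F1 ~0.667
```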
For more information related to Watson Assistant: Watson Assistant Documentation
Haode Qi is a data scientist at IBM Watson. He conducts research on NLP technologies and delivers machine learning algorithms into IBM Watson's market-leading conversational AI service. He is involved in several IBM open-source projects, such as the Auto-AI framework Lale, and works with a dozen clients to improve their AI chatbots.
Navneet Rao is an engineering lead at IBM Watson. He believes in building unique AI-powered experiences which augment human capabilities. He currently works on AI innovation & research for IBM's award-winning conversational computing platform, the IBM Watson Assistant.
Ming Tan, PhD, is a data scientist at IBM Watson. He works mainly on prototyping and productizing various algorithmic features related to the Watson Assistant service. He has broad research interests in deep learning approaches for conversational services and related NLP tasks, such as low-resource intent classification, out-of-domain detection, multi-user chat channels, passage-level semantic matching, and entity detection. He has published multiple research works at top-tier NLP conferences.
Yang Yu, PhD, is a data scientist at IBM Watson. His research focuses mainly on language understanding, question answering, deep learning, and representation learning methods for different NLP tasks. At IBM, he has won awards in several internal machine learning competitions with global researchers. Several novel machine learning solutions he designed and developed have solved critical question answering and human-computer dialog problems for popular Watson services on the market.