Commit 024cbc4

fix typos in creating new wiki based qna (#24)
Signed-off-by: Tomas Kral <tkral@redhat.com>
1 parent 547e3cf commit 024cbc4

1 file changed, with 21 additions and 16 deletions.
@@ -1,57 +1,62 @@
 In this tutorial we will walk you through building out a new `qna.yaml` for adding new or updated knowledge to the `granite-7b-lab` model. Let's get started!
 
+The first thing we need to do is create a new directory to have a clean place to work and pull down some software. Most of the time, the easiest thing to update in the model is the Wikipedia entry, so luckily, `erictherobot` has written a helpful tool to pull down markdown versions of the articles for us.
+
 ```bash
 mkdir instructlab
+cd instructlab
 git clone git@github.com:erictherobot/wikipedia-markdown-generator.git
 ```
-The first thing we need to do is create a new directory to have a clean place to work and pull down some software. Most of the time, the easiest thing to update in the model is the Wikipedia entry, so luckily, `erictherobot` has written a helpful tool to pull down markdown versions of the articles for us.
+
+After this, clone down your instructlab knowledge docs repository. It can be named whatever you'd like, but if you use our https://ui.instructlab.ai, you'll notice you already have `instructlab-knowledge-docs`.
 
 ```bash
 git clone git@github.com:<USERNAME>/instructlab-knowledge-docs.git
 ```
 
-After this, clone down your instructlab knowledge docs repository. It can be named whatever you'd like, but if you use our https://ui.instructlab.ai, you'll notice you already have `instructlab-knowledge-docs`.
+Next, we need to build a Python virtual environment and install the dependencies to get it to work. These commands cd into the directory, create the virtual environment with python3.11 (you may need to change the version of Python on your machine), activate the virtual environment, and then do the pip install the dependencies.
+You'll notice the `Texas_Longhorns_football` there, a Wikipedia article I wanted to pull down and create the `qna.yaml` against. You should choose whatever new knowledge you want to do here.
 
 ```bash
 cd wikipedia-markdown-generator
 python3.11 -m venv venv-md-gen
 source venv-md-gen/bin/activate
-pip install -r requirements
+pip install -r requirements.txt
 python3 wiki-to-md.py Texas_Longhorns_football
 ```
-Next, we need to build a Python virtual environment and install the dependencies to get it to work. These commands cd into the directory, create the virtual environment with python3.11 (you may need to change the version of Python on your machine), activate the virtual environment, and then do the pip install the dependencies.
-You'll notice the `Texas_Longhorns_football` there, a Wikipedia article I wanted to pull down and create the `qna.yaml` against. You should choose whatever new knowledge you want to do here.
+
+Next, we go ahead and copy the markdown into the knowledge repository, and commit it to our repository and push it up to GitHub.
 
 ```bash
-cp md_output/Texas_Longhorns_football.md ../../instructlab-knowledge-docs/
-cd ../../instructlab-knowledge-docs
+cp md_output/Texas_Longhorns_football.md ../instructlab-knowledge-docs/
+cd ../instructlab-knowledge-docs
 git add .
 git commit -m "added markdown doc"
 git push origin main
 cd ..
 ```
 
-Next, we go ahead and copy the markdown into the knowledge repository, and commit it to our repository and push it up to GitHub.
+Next we pull down the upstream public taxonomy directory, and `cd` into that directory.
 
-```
-git clone git@github.com/instructlab/taxonomy
+```bash
+git clone git@github.com:instructlab/taxonomy
 cd taxonomy
 ```
 
-Next we pull down the upstream public taxonomy directory, and `cd` into that directory.
+This next step is a "best effort" for you. As the taxonomy grows, there will be some obvious choices, but if you select a tree that hasn't been flushed out yet, you'll have to do your best to think about where you'd find the `qna.yaml`. In this case, the Dewey Decimal System says sports should be under arts; this is American Football, college level with the University of Texas. Also, notice the underscores for the spaces; this is important.
 
 ```bash
-mkdir -p arts/sports/american_football/college/university_of_texas/
+mkdir -p knowledge/arts/sports/american_football/college/university_of_texas/
 ```
-This next step is a "best effort" for you. As the taxonomy grows, there will be some obvious choices, but if you select a tree that hasn't been flushed out yet, you'll have to do your best to think about where you'd find the `qna.yaml`. In this case, the Dewey Decimal System says sports should be under arts; this is American Football, college level with the University of Texas. Also, notice the underscores for the spaces; this is important.
+
+Finally, you can pull down the `template_qna.yaml` and fill it out for the needed questions and answers. Be sure to put the context at a maximum of about 500 Tokens and questions and answers around 250 Tokens.
 
 ```bash
 wget https://raw.githubusercontent.com/instructlab/taxonomy/main/docs/template_qna.yaml
-mv template_qna.yaml sports/american_football/college/university_of_texas/qna.yaml
+mv template_qna.yaml knowledge/arts/sports/american_football/college/university_of_texas/qna.yaml
 ```
-Finally, you can pull down the `template_qna.yaml` and fill it out for the needed questions and answers. Be sure to put the context at a maximum of about 500 Tokens and questions and answers around 250 Tokens.
 
 ```
-vim sports/american_football/college/university_of_texas/qna.yaml
+vim knowledge/arts/sports/american_football/college/university_of_texas/qna.yaml
 ```
 
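The added lines in the diff above place the `qna.yaml` under `knowledge/` and stress that spaces become underscores. As a minimal sketch of that naming rule, the helper below builds such a path from human-readable segments. The function name `taxonomy_dir` is hypothetical and is not part of the tutorial or the taxonomy repository; lowercasing each segment is an assumption inferred from the example path, not a rule the tutorial states.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the tutorial or the taxonomy repo.
# Builds a taxonomy-style directory path under knowledge/ by lowercasing
# each segment (an assumption based on the example path) and replacing
# spaces with underscores (the rule the tutorial calls out).
taxonomy_dir() {
  local path="knowledge"
  local segment
  for segment in "$@"; do
    # tr is used instead of bash 4 case conversion so older shells also work
    segment="$(printf '%s' "$segment" | tr '[:upper:]' '[:lower:]' | tr ' ' '_')"
    path="$path/$segment"
  done
  printf '%s\n' "$path"
}

# Example: reproduce the directory used in the added lines of the diff.
taxonomy_dir arts sports "American Football" college "University of Texas"
# prints: knowledge/arts/sports/american_football/college/university_of_texas
```

The printed path can be passed to `mkdir -p` from inside the `taxonomy` checkout, which matches the corrected `mkdir` line in the commit.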
