Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
49 changes: 28 additions & 21 deletions python/paddle_fl/mpc/examples/linear_reg_with_uci/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,14 +8,16 @@ This document introduces how to run UCI Housing demo based on Paddle-MPC, which

#### (1). Prepare Data

Generate encrypted data utilizing `generate_encrypted_data()` in `process_data.py` script. For example, users can write the following code into a python script named `prepare.py`, and then run the script with command `python prepare.py`.
Generate encrypted data utilizing `generate_encrypted_data()` in `process_data.py` script. For example, users can write the following code into a python script named `prepare.py`, and then run the script with command `python3 prepare.py aby3` or `python3 prepare.py privc`. Users can choose to use ABY3 protocol or PrivC protocol according to the needs of parties.

```python
import sys
import process_data
process_data.protocol = sys.argv[1]
process_data.generate_encrypted_data()
```

Encrypted data files of feature and label would be generated and saved in `/tmp` directory. Different suffix names are used for these files to indicate the ownership of different computation parties. For instance, a file named `house_feature.part0` means it is a feature file of party 0.
Encrypted data files of feature and label would be generated and saved in `./mpc_data` directory. Different suffix names are used for these files to indicate the ownership of different computation parties. For instance, a file named `house_feature.part0` means it is a feature file of party 0.

#### (2). Launch Demo with A Shell Script

Expand All @@ -28,21 +30,24 @@ export LOCALHOST=/your/localhost
export REDIS_PORT=/your/redis/port
```

Launch demo with the `run_standalone.sh` script. The concrete command is:
Launch demo with the `run_standalone_*.sh` script. The concrete command is:

```bash
bash run_standalone.sh uci_demo.py
`if ABY3`
bash run_standalone_aby3.sh uci_demo.py
`if PrivC`
bash run_standalone_privc.sh uci_demo.py
```

The loss with cypher text format will be displayed on screen while training. At the same time, the loss data would be also save in `/tmp` directory, and the format of file name is similar to what is described in Step 1.
The loss with cypher text format will be displayed on screen while training. At the same time, the loss data would be also save in `./mpc_infer_data/` directory, and the format of file name is similar to what is described in Step 1.

Besides, predictions would be made in this demo once training is finished. The predictions with cypher text format would also be save in `/tmp` directory.
Besides, predictions would be made in this demo once training is finished. The predictions with cypher text format would also be save in `./mpc_infer_data/` directory.

#### (3). Decrypt Data

Finally, using `load_decrypt_data()` in `process_data.py` script, this demo would decrypt and print the loss and predictions, which can be compared with related results of Paddle plain text model.

For example, users can write the following code into a python script named `decrypt_save.py`, and then run the script with command `python decrypt_save.py decrypt_loss_file decrypt_prediction_file`. The decrypted loss and prediction results would be saved into two files correspondingly.
For example, users can write the following code into a python script named `decrypt_save.py`, and then run the script with command `python3 decrypt_save.py decrypt_loss_file decrypt_prediction_file aby3` or `python3 decrypt_save.py decrypt_loss_file decrypt_prediction_file privc`. The decrypted loss and prediction results would be saved into two files correspondingly.

```python
import sys
Expand All @@ -52,17 +57,18 @@ import process_data

decrypt_loss_file=sys.argv[1]
decrypt_prediction_file=sys.argv[2]
process_data.protocol = sys.argv[3]
BATCH_SIZE=10
process_data.load_decrypt_data("/tmp/uci_loss", (1, ), decrypt_loss_file)
process_data.load_decrypt_data("/tmp/uci_prediction", (BATCH_SIZE, ), decrypt_prediction_file)
process_data.load_decrypt_data("./mpc_infer_data/uci_loss", (1, ), decrypt_loss_file)
process_data.load_decrypt_data("./mpc_infer_data/uci_prediction", (BATCH_SIZE, ), decrypt_prediction_file)
```

**Note** that remember to delete the loss and prediction files in `/tmp` directory generated in last running, in case of any influence on the decrypted results of current running. For simplifying users operations, we provide the following commands in `run_standalone.sh`, which can delete the files mentioned above when running this script.
**Note** that remember to delete the loss and prediction files in `./mpc_infer_data` directory generated in last running, in case of any influence on the decrypted results of current running. For simplifying users operations, you can write the following commands in `run_standalone.sh`, which can delete the files mentioned above when running this script.

```bash
# remove temp data generated in last time
LOSS_FILE="/tmp/uci_loss.*"
PRED_FILE="/tmp/uci_prediction.*"
LOSS_FILE="./mpc_infer_data/uci_loss.*"
PRED_FILE="./mpc_infer_data/uci_prediction.*"
if [ "$LOSS_FILE" ]; then
rm -rf $LOSS_FILE
fi
Expand All @@ -82,7 +88,7 @@ Data owner encrypts data. Concrete operations are consistent with “Prepare Dat

#### (2). Distribute Encrypted Data

According to the suffix of file name, distribute encrypted data files to `/tmp ` directories of all 3 computation parties. For example, send `house_feature.part0` and `house_label.part0` to `/tmp` of party 0 with `scp` command.
According to the suffix of file name, distribute encrypted data files to `./mpc_data ` directories of all 3 computation parties. For example, send `house_feature.part0` and `house_label.part0` to `./mpc_data` of party 0 with `scp` command.

#### (3). Modify uci_demo.py

Expand All @@ -93,7 +99,7 @@ Each computation party makes the following modifications on `uci_demo.py` accord
Modify `localhost` in the following code as the IP address of the machine.

```python
pfl_mpc.init("aby3", int(role), "localhost", server, int(port))
pfl_mpc.init(mpc_protocol_name, int(role), "localhost", server, int(port))
```

#### (4). Launch Demo on Each Party
Expand All @@ -107,18 +113,18 @@ $REDIS_BIN -h $SERVER -p $PORT flushall
Launch demo on each computation party with the following command,

```
$PYTHON_EXECUTABLE uci_demo.py $PARTY_ID $SERVER $PORT
$PYTHON_EXECUTABLE uci_demo.py $PARTY_ID $SERVER $PORT $PROTOCOL
```

where PYTHON_EXECUTABLE is the python which installs PaddleFL, PARTY_ID is the ID of computation party, which is 0, 1, or 2, SERVER and PORT represent the IP and port of Redis server respectively.
where PYTHON_EXECUTABLE is the python which installs PaddleFL, PARTY_ID is the ID of computation party, which is 0, 1, or 2, SERVER and PORT represent the IP and port of Redis server respectively. PROTOCOL is the MPC protocol that users choose.

Similarly, training loss with cypher text format would be printed on the screen of each computation party. And at the same time, the loss and predictions would be saved in `/tmp` directory.
Similarly, training loss with cypher text format would be printed on the screen of each computation party. And at the same time, the loss and predictions would be saved in `./mpc_infer_data` directory.

**Note** that remember to delete the loss and prediction files in `/tmp` directory generated in last running, in case of any influence on the decrypted results of current running.
**Note** that remember to delete the loss and prediction files in `./mpc_infer_data` directory generated in last running, in case of any influence on the decrypted results of current running.

#### (5). Decrypt Loss and Prediction Data

Each computation party sends `uci_loss.part` and `uci_prediction.part` files in `/tmp` directory to the `/tmp` directory of data owner. Data owner decrypts and gets the plain text of loss and predictions with ` load_decrypt_data()` in `process_data.py`.
Each computation party sends `uci_loss.part` and `uci_prediction.part` files in `./mpc_infer_data` directory to the `./mpc_infer_data` directory of data owner. Data owner decrypts and gets the plain text of loss and predictions with ` load_decrypt_data()` in `process_data.py`.

For example, the following code can be written into a python script to decrypt and print training loss and predictions.

Expand All @@ -130,9 +136,10 @@ import process_data

decrypt_loss_file=sys.argv[1]
decrypt_prediction_file=sys.argv[2]
process_data.protocol = sys.argv[3]
BATCH_SIZE=10
process_data.load_decrypt_data("/tmp/uci_loss", (1, ), decrypt_loss_file)
process_data.load_decrypt_data("/tmp/uci_prediction", (BATCH_SIZE, ), decrypt_prediction_file)
process_data.load_decrypt_data("./mpc_infer_data/uci_loss", (1, ), decrypt_loss_file)
process_data.load_decrypt_data("./mpc_infer_data/uci_prediction", (BATCH_SIZE, ), decrypt_prediction_file)
```

### 3. Convergence of paddle_fl.mpc vs paddle
Expand Down
53 changes: 30 additions & 23 deletions python/paddle_fl/mpc/examples/linear_reg_with_uci/README_CN.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,14 +8,16 @@

#### 1. 准备数据

使用`process_data.py`脚本中的`generate_encrypted_data()`产生加密数据,比如将如下内容写到一个`prepare.py`脚本中,然后`python prepare.py`
使用`process_data.py`脚本中的`generate_encrypted_data()`产生加密数据,比如将如下内容写到一个`prepare.py`脚本中,然后`python3 prepare.py aby3` 或者 `python3 prepare.py privc`,用户可根据多方计算实体数目需求选择采用ABY3协议或者PrivC协议。

```python
import sys
import process_data
process_data.protocol = sys.argv[1]
process_data.generate_encrypted_data()
```

将在/tmp目录下生成对应于3个计算party的feature和label的加密数据文件,以后缀名区分属于不同party的数据。比如,`house_feature.part0`表示属于party0的feature数据。
以ABY3协议为例,将在./mpc_data目录下生成对应于3个计算party的feature和label的加密数据文件,以后缀名区分属于不同party的数据。比如,`house_feature.part0`表示属于party0的feature数据。

#### 2. 使用shell脚本启动demo

Expand All @@ -28,20 +30,23 @@ export LOCALHOST=/your/localhost
export REDIS_PORT=/your/redis/port
```

然后使用`run_standalone.sh`脚本,启动并运行demo,命令如下:
然后使用`run_standalone_**.sh`脚本,启动并运行demo,命令如下:

```bash 
bash run_standalone.sh uci_demo.py
`若采用ABY3`
bash run_standalone_aby3.sh uci_demo.py
`若采用PrivC`
bash run_standalone_privc.sh uci_demo.py
```

运行之后将在屏幕上打印训练过程中的密文loss数据,同时,对应的密文loss数据将会保存到/tmp目录下的文件中,文件命名格式类似于步骤1中所述。
运行之后将在屏幕上打印训练过程中的密文loss数据,同时,对应的密文loss数据将会保存到./mpc_infer_data/目录下的文件中,文件命名格式类似于步骤1中所述。

此外,在完成训练之后,demo会继续进行预测,并将预测密文结果也保存到/tmp目录下的文件中
此外,在完成训练之后,demo会继续进行预测,并将预测密文结果也保存到./mpc_infer_data/目录下的文件中

#### 3. 解密数据

最后,demo会使用`process_data.py`脚本中的`load_decrypt_data()`,恢复并打印出明文的loss数据和prediction结果,用以和明文Paddle模型结果进行对比。
例如,将下面的内容写到一个decrypt_save.py脚本中,然后python decrypt_save.py decrypt_loss_file decrypt_prediction_file,将把明文losss数据和预测结果分别保存在文件中
例如,将下面的内容写到一个decrypt_save.py脚本中,然后python3 decrypt_save.py decrypt_loss_file decrypt_prediction_file aby3 或者 python3 decrypt_save.py decrypt_loss_file decrypt_prediction_file privc,将把明文loss数据和预测结果分别保存在文件中

```python
import sys
Expand All @@ -52,16 +57,17 @@ import process_data
decrypt_loss_file=sys.argv[1]
decrypt_prediction_file=sys.argv[2]
BATCH_SIZE=10
process_data.load_decrypt_data("/tmp/uci_loss", (1, ), decrypt_loss_file)
process_data.load_decrypt_data("/tmp/uci_prediction", (BATCH_SIZE, ), decrypt_prediction_file)
process_data.protocol = sys.argv[3]
process_data.load_decrypt_data("./mpc_infer_data/uci_loss", (1, ), decrypt_loss_file)
process_data.load_decrypt_data("./mpc_infer_data/uci_prediction", (BATCH_SIZE, ), decrypt_prediction_file)
```

**注意**:再次启动运行demo之前,请先将上次在`/tmp`保存的loss和prediction文件删除,以免影响本次密文数据的恢复结果。为了简化用户操作,我们在`run_standalone.sh`脚本中加入了如下的内容,可以在执行脚本时删除上次数据
**注意**:再次启动运行demo之前,请先将上次在`./mpc_infer_data`保存的loss和prediction文件删除,以免影响本次密文数据的恢复结果。为了简化用户操作,可以在`run_standalone.sh`脚本中加入如下的内容,在执行脚本时删除上次数据

```bash
# remove temp data generated in last time
LOSS_FILE="/tmp/uci_loss.*"
PRED_FILE="/tmp/uci_prediction.*"
LOSS_FILE="./mpc_infer_data/uci_loss.*"
PRED_FILE="./mpc_infer_data/uci_prediction.*"
if [ "$LOSS_FILE" ]; then
rm -rf $LOSS_FILE
fi
Expand All @@ -81,9 +87,9 @@ fi

#### 2. 分发数据

按照后缀名,将步骤1中准备好的数据分别发送到对应的计算party的/tmp目录下。比如,使用scp命令,将
按照后缀名,将步骤1中准备好的数据分别发送到对应的计算party的./mpc_data目录下。比如,使用scp命令,将

`house_feature.part0`和`house_label.part0`发送到party0的/tmp目录下
`house_feature.part0`和`house_label.part0`发送到party0的./mpc_data目录下

#### 3. 计算party修改uci_demo.py脚本

Expand All @@ -94,7 +100,7 @@ fi
将脚本如下内容中的`localhost`修改为自己的IP地址:

```python
pfl_mpc.init("aby3", int(role), "localhost", server, int(port))
pfl_mpc.init(mpc_protocol_name, int(role), "localhost", server, int(port))
```


Expand All @@ -109,20 +115,20 @@ $REDIS_BIN -h $SERVER -p $PORT flushall
在各计算party分别执行以下命令,启动demo:

```
$PYTHON_EXECUTABLE uci_demo.py $PARTY_ID $SERVER $PORT
$PYTHON_EXECUTABLE uci_demo.py $PARTY_ID $SERVER $PORT $PROTOCOL
```

其中,PYTHON_EXECUTABLE表示自己安装了PaddleFL的python,PARTY_ID表示计算party的编号,值为0、1或2,SERVER和PORT分别表示redis server的IP地址和端口号。
其中,PYTHON_EXECUTABLE表示自己安装了PaddleFL的python,PARTY_ID表示计算party的编号,值为0、1或2,SERVER和PORT分别表示redis server的IP地址和端口号,PROTOCOL表示采用的多方安全计算协议

同样地,运行之后将在各计算party的屏幕上打印训练过程中的密文loss数据。同时,对应的密文loss和prediction数据将会保存到`/tmp`目录下的文件中,文件命名格式类似于步骤1中所述。
同样地,运行之后将在各计算party的屏幕上打印训练过程中的密文loss数据。同时,对应的密文loss和prediction数据将会保存到`./mpc_infer_data`目录下的文件中,文件命名格式类似于步骤1中所述。

**注意**:再次启动运行demo之前,请先将上次在`/tmp`保存的loss和prediction文件删除,以免影响本次密文数据的恢复结果。
**注意**:再次启动运行demo之前,请先将上次在`./mpc_infer_data`保存的loss和prediction文件删除,以免影响本次密文数据的恢复结果。

#### 5. 数据方解密loss和prediction

各计算party将`/tmp`目录下的`uci_loss.part`和`uci_prediction.part`文件发送到数据方的/tmp目录下。数据方使用process_data.py脚本中的load_decrypt_data()解密恢复出loss数据和prediction数据。
各计算party将`./mpc_infer_data`目录下的`uci_loss.part`和`uci_prediction.part`文件发送到数据方的./mpc_infer_data目录下。数据方使用process_data.py脚本中的load_decrypt_data()解密恢复出loss数据和prediction数据。

例如,将下面的内容写到一个decrypt_save.py脚本中,然后python decrypt_save.py decrypt_loss_file decrypt_prediction_file,将把明文losss数据和预测结果分别保存在文件中。
例如,将下面的内容写到一个decrypt_save.py脚本中,然后python3 decrypt_save.py decrypt_loss_file decrypt_prediction_file aby3,将把明文losss数据和预测结果分别保存在文件中。

```python
import sys
Expand All @@ -132,9 +138,10 @@ import process_data

decrypt_loss_file=sys.argv[1]
decrypt_prediction_file=sys.argv[2]
process_data.protocol = sys.argv[3]
BATCH_SIZE=10
process_data.load_decrypt_data("/tmp/uci_loss", (1, ), decrypt_loss_file)
process_data.load_decrypt_data("/tmp/uci_prediction", (BATCH_SIZE, ), decrypt_prediction_file)
process_data.load_decrypt_data("./mpc_infer_data/uci_loss", (1, ), decrypt_loss_file)
process_data.load_decrypt_data("./mpc_infer_data/uci_prediction", (BATCH_SIZE, ), decrypt_prediction_file)
```

### 三. 单机精度测试
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,8 @@
decrypt_loss_file=sys.argv[1]
decrypt_prediction_file=sys.argv[2]
BATCH_SIZE=10

process_data.protocol = sys.argv[3]
process_data.load_decrypt_data("./mpc_infer_data/uci_loss", (1, ), decrypt_loss_file)
print("uci_loss done")
process_data.load_decrypt_data("./mpc_infer_data/uci_prediction", (BATCH_SIZE, ), decrypt_prediction_file)
Expand Down
4 changes: 3 additions & 1 deletion python/paddle_fl/mpc/examples/linear_reg_with_uci/prepare.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,9 @@
"""
Prepare data for UCI Housing.
"""
import process_data
import sys

import process_data

process_data.protocol = sys.argv[1]
process_data.generate_encrypted_data()
Loading