Open
Description
I'm trying to train a model on my own dataset, but the VAE loss becomes negative, which seems quite strange.
Could you help me with this? Thanks!
VAE training log with factor rot:
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:524: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:532: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
Namespace(batch_size=64, cross_entropy_loss=False, datapath='../datasets/data/coeff', epochs=600, factor='rot', gpu=0, lr=0.0001, lr_epochs=150, lr_fac=0.5, output_path='./weights', root_folder='.', val=False, write_iteration=600)
46975
2022-04-14 10:43:10.860388: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2022-04-14 10:43:11.042504: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: NVIDIA Tesla P40 major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:08:00.0
totalMemory: 23.88GiB freeMemory: 22.99GiB
2022-04-14 10:43:11.042551: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2022-04-14 10:43:11.427630: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-04-14 10:43:11.427679: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2022-04-14 10:43:11.427686: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2022-04-14 10:43:11.427814: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22300 MB memory) -> physical GPU (device: 0, name: NVIDIA Tesla P40, pci bus id: 0000:08:00.0, compute capability: 6.1)
Date: 2022-04-14 10:43:26 Epoch: [Stage 1][0/600] Loss: 2.7066.
Date: 2022-04-14 10:43:40 Epoch: [Stage 1][1/600] Loss: 2.4838.
Date: 2022-04-14 10:43:55 Epoch: [Stage 1][2/600] Loss: 2.2725.
Date: 2022-04-14 10:44:10 Epoch: [Stage 1][3/600] Loss: 2.0638.
Date: 2022-04-14 10:44:24 Epoch: [Stage 1][4/600] Loss: 1.8573.
Date: 2022-04-14 10:44:39 Epoch: [Stage 1][5/600] Loss: 1.6532.
Date: 2022-04-14 10:44:54 Epoch: [Stage 1][6/600] Loss: 1.4517.
Date: 2022-04-14 10:45:09 Epoch: [Stage 1][7/600] Loss: 1.2532.
Date: 2022-04-14 10:45:24 Epoch: [Stage 1][8/600] Loss: 1.0580.
Date: 2022-04-14 10:45:39 Epoch: [Stage 1][9/600] Loss: 0.8668.
Date: 2022-04-14 10:45:54 Epoch: [Stage 1][10/600] Loss: 0.6800.
Date: 2022-04-14 10:46:08 Epoch: [Stage 1][11/600] Loss: 0.4984.
Date: 2022-04-14 10:46:23 Epoch: [Stage 1][12/600] Loss: 0.3226.
Date: 2022-04-14 10:46:38 Epoch: [Stage 1][13/600] Loss: 0.1537.
Date: 2022-04-14 10:46:53 Epoch: [Stage 1][14/600] Loss: -0.0077.
Date: 2022-04-14 10:47:07 Epoch: [Stage 1][15/600] Loss: -0.1606.
Date: 2022-04-14 10:47:22 Epoch: [Stage 1][16/600] Loss: -0.3031.
Date: 2022-04-14 10:47:37 Epoch: [Stage 1][17/600] Loss: -0.4346.
Date: 2022-04-14 10:47:52 Epoch: [Stage 1][18/600] Loss: -0.5539.
Date: 2022-04-14 10:48:07 Epoch: [Stage 1][19/600] Loss: -0.6600.
Date: 2022-04-14 10:48:21 Epoch: [Stage 1][20/600] Loss: -0.7503.
Date: 2022-04-14 10:48:37 Epoch: [Stage 1][21/600] Loss: -0.8247.
Date: 2022-04-14 10:48:51 Epoch: [Stage 1][22/600] Loss: -0.8824.
Date: 2022-04-14 10:49:06 Epoch: [Stage 1][23/600] Loss: -0.9245.
Date: 2022-04-14 10:49:20 Epoch: [Stage 1][24/600] Loss: -0.9519.
Date: 2022-04-14 10:49:35 Epoch: [Stage 1][25/600] Loss: -0.9678.
Date: 2022-04-14 10:49:50 Epoch: [Stage 1][26/600] Loss: -0.9839.
Date: 2022-04-14 10:50:05 Epoch: [Stage 1][27/600] Loss: -1.0732.
Date: 2022-04-14 10:50:20 Epoch: [Stage 1][28/600] Loss: -1.2186.
Date: 2022-04-14 10:50:34 Epoch: [Stage 1][29/600] Loss: -1.2832.
Date: 2022-04-14 10:50:49 Epoch: [Stage 1][30/600] Loss: -1.3243.
Date: 2022-04-14 10:51:04 Epoch: [Stage 1][31/600] Loss: -1.3485.
Date: 2022-04-14 10:51:19 Epoch: [Stage 1][32/600] Loss: -1.3644.
Date: 2022-04-14 10:51:34 Epoch: [Stage 1][33/600] Loss: -1.3820.
Date: 2022-04-14 10:51:49 Epoch: [Stage 1][34/600] Loss: -1.3819.
VAE training log with factor gamma:
root@train-disco3-0:/data1/DiscoFaceGAN/vae# python demo.py --datapath ../datasets/data/coeff --factor gamma
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:524: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:532: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
Namespace(batch_size=64, cross_entropy_loss=False, datapath='../datasets/data/coeff', epochs=600, factor='gamma', gpu=0, lr=0.0001, lr_epochs=150, lr_fac=0.5, output_path='./weights', root_folder='.', val=False, write_iteration=600)
46975
2022-04-14 10:42:52.898209: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2022-04-14 10:42:53.063619: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: NVIDIA Tesla P40 major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:08:00.0
totalMemory: 23.88GiB freeMemory: 23.22GiB
2022-04-14 10:42:53.063673: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2022-04-14 10:42:53.419503: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-04-14 10:42:53.419554: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2022-04-14 10:42:53.419561: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2022-04-14 10:42:53.419681: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22532 MB memory) -> physical GPU (device: 0, name: NVIDIA Tesla P40, pci bus id: 0000:08:00.0, compute capability: 6.1)
Date: 2022-04-14 10:43:07 Epoch: [Stage 1][0/600] Loss: 23.9197.
Date: 2022-04-14 10:43:22 Epoch: [Stage 1][1/600] Loss: 21.9015.
Date: 2022-04-14 10:43:36 Epoch: [Stage 1][2/600] Loss: 19.9280.
Date: 2022-04-14 10:43:50 Epoch: [Stage 1][3/600] Loss: 17.9590.
Date: 2022-04-14 10:44:04 Epoch: [Stage 1][4/600] Loss: 15.9933.
Date: 2022-04-14 10:44:18 Epoch: [Stage 1][5/600] Loss: 14.0305.
Date: 2022-04-14 10:44:32 Epoch: [Stage 1][6/600] Loss: 12.0710.
Date: 2022-04-14 10:44:46 Epoch: [Stage 1][7/600] Loss: 10.1152.
Date: 2022-04-14 10:45:01 Epoch: [Stage 1][8/600] Loss: 8.1635.
Date: 2022-04-14 10:45:15 Epoch: [Stage 1][9/600] Loss: 6.2167.
Date: 2022-04-14 10:45:29 Epoch: [Stage 1][10/600] Loss: 4.2754.
Date: 2022-04-14 10:45:44 Epoch: [Stage 1][11/600] Loss: 2.3405.
Date: 2022-04-14 10:45:58 Epoch: [Stage 1][12/600] Loss: 0.4129.
Date: 2022-04-14 10:46:12 Epoch: [Stage 1][13/600] Loss: -1.5061.
Date: 2022-04-14 10:46:27 Epoch: [Stage 1][14/600] Loss: -3.4153.
Date: 2022-04-14 10:46:41 Epoch: [Stage 1][15/600] Loss: -5.3130.
Date: 2022-04-14 10:46:55 Epoch: [Stage 1][16/600] Loss: -7.1975.
Date: 2022-04-14 10:47:10 Epoch: [Stage 1][17/600] Loss: -9.0669.
Date: 2022-04-14 10:47:23 Epoch: [Stage 1][18/600] Loss: -10.9186.
Date: 2022-04-14 10:47:38 Epoch: [Stage 1][19/600] Loss: -12.7502.
Date: 2022-04-14 10:47:52 Epoch: [Stage 1][20/600] Loss: -14.5583.
Date: 2022-04-14 10:48:07 Epoch: [Stage 1][21/600] Loss: -16.3394.
Date: 2022-04-14 10:48:21 Epoch: [Stage 1][22/600] Loss: -18.0895.
Date: 2022-04-14 10:48:36 Epoch: [Stage 1][23/600] Loss: -19.8038.
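For context on why a negative value is at least possible: the loss code is not shown in this issue, so the following is only a sketch under an assumption. If the reconstruction term is a Gaussian negative log-likelihood with a learnable scale (called `gamma` here, a hypothetical name), it is the negative log of a continuous density rather than of a probability, so it has no lower bound of zero. As the residual shrinks and the scale shrinks with it, the `log(gamma)` term can pull the total below zero. The function name and the numbers below are illustrative and not taken from the repository.

```python
import math

# Constant term of the Gaussian log-density: 0.5 * log(2 * pi).
HALF_LOG_TWO_PI = 0.5 * math.log(2.0 * math.pi)


def gaussian_nll(x, x_hat, log_gamma):
    """Negative log-density of x under N(x_hat, gamma^2), for one dimension.

    This is an assumed form of the reconstruction term, not necessarily
    the one used in the training script discussed above.
    """
    gamma = math.exp(log_gamma)
    return 0.5 * ((x - x_hat) / gamma) ** 2 + log_gamma + HALF_LOG_TWO_PI


# Hypothetical target and reconstruction with a small residual of 0.01.
x, x_hat = 0.50, 0.49

wide = gaussian_nll(x, x_hat, log_gamma=0.0)     # gamma = 1: value is positive
narrow = gaussian_nll(x, x_hat, log_gamma=-3.0)  # gamma ~ 0.05: value is negative

print("gamma = 1     ->", wide)
print("gamma = e^-3  ->", narrow)
```

Under this assumption, a loss that keeps decreasing through zero, as in both logs, would be consistent with the model fitting the data more tightly rather than with a numerical error. Whether that applies here depends on the actual loss definition in the training script.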