Commit c6ac5ad

Update README.md
1 parent 7874f60 commit c6ac5ad

1 file changed

Lines changed: 2 additions & 1 deletion

README.md

@@ -1,5 +1,4 @@
 # Discriminator-Guided Chain-of-Thought Reasoning
-[[**Paper**]](https://arxiv.org/abs/2305.14934), [[**Website**]](https://mukhal.github.io/grace/)
 
 ![image](https://github.com/mukhal/grace-decoding/assets/5109053/cdb93474-1613-47d8-9bf4-be2ae3086979)
 
@@ -30,6 +29,8 @@ WANDB_MODE=disabled python sample_negative_solutions.py --in_file data/$TASK/tra
 All parameters are self-explanatory, but `--sample_calc` means we will use calculator sampling. That is whenever an operation such as `<< 4 + 5=9 >>` is generated, we will invoke a calculator module to compute the result.
 
 ### Steps 2 and 3: Alignment and Discriminator Training
+![image](https://github.com/mukhal/grace/assets/5109053/ebbefdc2-0861-4fbc-ad0f-43316741bf58)
+
 Now we want to train a FLAN-T5 encoder as a discriminator over the sampled solutions.
 ```
 accelerate launch --mixed_precision=bf16 --num_processes=$GPUS_PER_NODE train_discriminator.py --task gsm8k \
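
Note on the `--sample_calc` context line in the second hunk: the README says a calculator module computes the result whenever an annotation such as `<< 4 + 5=9 >>` is generated. The sketch below illustrates that idea only. It is not the repository's implementation: the names `calculate` and `apply_calculator` are hypothetical, and it rewrites annotations after the text is complete, whereas the actual module may step in during decoding.

```python
import ast
import operator
import re

# Hypothetical helpers for illustration; none of these names come from the repository.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval(node):
    """Evaluate a parsed expression made only of numbers and + - * /."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    raise ValueError("unsupported expression")


def calculate(expression: str):
    """Compute an arithmetic expression without calling eval()."""
    value = _eval(ast.parse(expression.strip(), mode="eval").body)
    # Render whole-number floats (e.g. 8 / 2) without a trailing ".0".
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value


# Matches `<< expr=result >>`; the generated result may be wrong or empty.
_ANNOTATION = re.compile(r"<<([^=<>]+)=([^<>]*)>>")


def apply_calculator(text: str) -> str:
    """Replace the result in every annotation with the computed value."""

    def fix(match):
        expr = match.group(1).strip()
        try:
            return f"<< {expr}={calculate(expr)} >>"
        except (ValueError, SyntaxError, ZeroDivisionError):
            return match.group(0)  # leave annotations we cannot evaluate untouched

    return _ANNOTATION.sub(fix, text)
```

For example, `apply_calculator("She has << 4 + 5=10 >> apples")` corrects the annotation to `<< 4 + 5=9 >>`. Parsing with `ast` rather than calling `eval` keeps model-generated text from executing arbitrary code.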
