Open
65 commits
27c252f
update AL test agent and enable it in config.yaml
ventselartur Dec 19, 2025
e61a576
shorten the dataset in my private branch with only items fail with AL…
ventselartur Dec 22, 2025
6fab1ee
agent without libraries and al mcp
AleksanderGladkov Dec 22, 2025
924de54
al test minimal
AleksanderGladkov Dec 22, 2025
c902dd2
revert bcbench.jsonl to get full dataset
ventselartur Dec 23, 2025
e6f8731
update ALTestMinimal
ventselartur Dec 24, 2025
1712667
merge from main
ventselartur Dec 24, 2025
ec41ee4
improve instructions with LLM
ventselartur Dec 24, 2025
6da5dc0
improve instructions with LLM
ventselartur Dec 24, 2025
99ca30e
Get-WorkFlowSummary.ps1 baseline
ventselartur Dec 25, 2025
665594f
cherry pick Sasha additional logging
ventselartur Dec 25, 2025
8430964
keep only first-party apps in dataset/bcbench.jsonl
ventselartur Dec 26, 2025
d9009b4
merge from main
ventselartur Dec 26, 2025
a542317
fix scripts which accept only dataset entries with tests
ventselartur Dec 26, 2025
c9903b4
add NAV Prs with first-party apps
ventselartur Dec 26, 2025
f3d16dd
disable the agent on config.yaml
ventselartur Dec 26, 2025
a79e3ee
remove failed PR from the dataset and add area for all new entries
ventselartur Dec 26, 2025
0ec3e3c
remove entries from dataset where app cannot compile with the fix
ventselartur Dec 26, 2025
313d0bb
introduce one more version of the agent with special hints for handle…
ventselartur Dec 26, 2025
41ed3a0
merge from master
ventselartur Jan 6, 2026
8ff082f
update
ventselartur Jan 6, 2026
b1fa5ff
remove two commits which do not work for BC bench
ventselartur Jan 6, 2026
5cdc3f7
revert changes in bcbench.jsonl to get the original one
ventselartur Jan 6, 2026
ee3f7f9
update instructions for AL test agent
ventselartur Jan 7, 2026
92829d4
merge from main
ventselartur Jan 7, 2026
009badb
merge from main
ventselartur Jan 19, 2026
3be7aff
revert changes to Python scripts
ventselartur Jan 19, 2026
806c343
remove unnecessary problem statements
ventselartur Jan 19, 2026
70425c4
revert changes to Python scripts
ventselartur Jan 19, 2026
cfe0324
revert change to scripts/AppUtils.psm1
ventselartur Jan 19, 2026
665ed66
revert change to scripts/AppUtils.psm1
ventselartur Jan 19, 2026
ad507ca
merge from main
ventselartur Jan 19, 2026
a24b4c1
merge from main
ventselartur Jan 20, 2026
5c80fc1
Merge branch 'main' of https://github.com/microsoft/BC-Bench into pr/…
haoranpb Jan 21, 2026
2ffa172
update
ventselartur Jan 22, 2026
28d01ea
merge from main
ventselartur Jan 22, 2026
e38f71c
update Agent name and disable it
ventselartur Jan 22, 2026
e9bb510
merge from master
ventselartur Feb 3, 2026
5bb8cc2
revert innecessary changes to AL test agent
ventselartur Feb 3, 2026
aaee7d6
switch test generation input to problem-statement
ventselartur Feb 3, 2026
a12b815
enable the agent
ventselartur Feb 4, 2026
1134d5d
merge from master
ventselartur Feb 20, 2026
5b029b9
merge from master
ventselartur Mar 2, 2026
06f3768
set new ALTest agent
ventselartur Mar 2, 2026
975912d
init commit - Arturs's changes
Mar 13, 2026
3ebf350
test workflow
Mar 13, 2026
f2c06fd
scripts
Mar 25, 2026
025d3b4
agent test changes
Mar 25, 2026
dd3b8b5
Merge branch 'main' into private/milicadjukic/BCBenchScript
Mar 25, 2026
212469c
Merge branch 'private/milicadjukic/BCBenchScript' into private/milica…
Mar 25, 2026
94497fe
new + evaluations
Mar 30, 2026
dfd061a
Merge branch 'main' into private/milicadjukic/BCBenchScript2
Mar 30, 2026
5b27449
removed grouping
Apr 1, 2026
68094ca
group_errrors_from_summary script
Apr 1, 2026
be7b687
out2 files with get-workflowsummary.ps1
Apr 2, 2026
661edca
modified get-workflowsummary.ps1
Apr 2, 2026
e5cc118
skip canceled
Apr 6, 2026
afa0c43
Supports TWO input modes
Apr 6, 2026
25c1a71
remove artifacts
Apr 7, 2026
ef03144
removed artifacts2
Apr 7, 2026
f490eff
delete unused script
Apr 7, 2026
f14a39c
removed workflows
Apr 7, 2026
dfdf006
Merge branch 'main' into private/milicadjukic/BCBenchScript2
Apr 7, 2026
cec9215
scripts updated
Apr 8, 2026
3d38770
revert changes for al test agent
Apr 9, 2026