wip: [01-stabilize] paused at task 1/1 - OCR Hallucination Immune logic via Semantic delta window and fret-isolation

This commit is contained in:
2026-03-29 22:08:40 +09:00
parent aca7bf592a
commit 2507de45d3
4289 changed files with 732689 additions and 28672 deletions

1
.agent/vendor/mini-swe/docs/CNAME vendored Normal file

@@ -0,0 +1 @@
swe-agent.com

10
.agent/vendor/mini-swe/docs/README.md vendored Normal file

@@ -0,0 +1,10 @@
# SWE-agent
<div style="text-align: center;">
<img src="assets/readme_assets/swe-agent-banner.png" alt="SWE-agent banner" style="height: 12em;">
</div>
🔗 Simply want to read the docs? Please head to [the web version of these docs](https://swe-agent.com/latest/)
This folder holds the source for the SWE-agent documentation.
Want to modify and build the website locally? See [here](https://swe-agent.com/latest/dev/contribute#mkdocs).

21
.agent/vendor/mini-swe/docs/_footer.md vendored Normal file

@@ -0,0 +1,21 @@
<div class="grid cards">
<a href="https://github.com/SWE-agent/SWE-agent/issues/new" class="nav-card-link">
<div class="nav-card">
<div class="nav-card-header">
<span class="material-icons nav-card-icon">bug_report</span>
<span class="nav-card-title">Something broken/unclear?</span>
</div>
<p class="nav-card-description">Open an issue on GitHub!</p>
</div>
</a>
<a href="https://join.slack.com/t/swe-bench/shared_invite/zt-36pj9bu5s-o3_yXPZbaH2wVnxnss1EkQ" class="nav-card-link">
<div class="nav-card">
<div class="nav-card-header">
<span class="material-icons nav-card-icon">help</span>
<span class="nav-card-title">Open-ended discussions</span>
</div>
<p class="nav-card-description">Join our Slack!</p>
</div>
</a>
</div>

Binary file not shown.


Binary file not shown.



@@ -0,0 +1,83 @@
/* Floating bubbles styles */
.floating-bubbles {
position: fixed;
bottom: 20px;
right: 20px;
display: flex;
flex-direction: column;
gap: 10px;
z-index: 1000;
}
.floating-bubbles-title {
position: absolute;
top: -30px;
right: 0;
font-size: 12px;
color: #777;
text-align: right;
font-weight: bold;
opacity: 0;
visibility: hidden;
transition:
opacity 0.3s ease,
visibility 0.3s ease;
white-space: nowrap;
}
.floating-bubbles:hover .floating-bubbles-title {
opacity: 1;
visibility: visible;
}
.bubble {
width: 40px;
height: 40px;
display: flex;
justify-content: center;
align-items: center;
position: relative;
transition: transform 0.3s ease;
}
.bubble:hover {
transform: scale(1.1);
}
.bubble img {
width: 40px;
height: 40px;
}
.bubble-tooltip {
position: absolute;
right: 60px;
background-color: #333;
color: white;
padding: 5px 10px;
border-radius: 4px;
font-size: 14px;
white-space: nowrap;
opacity: 0;
visibility: hidden;
transition:
opacity 0.3s ease,
visibility 0.3s ease;
}
.bubble:hover .bubble-tooltip {
opacity: 1;
visibility: visible;
}
.floating-bubbles:hover .bubble-tooltip {
opacity: 1;
visibility: visible;
}
/* Hide on mobile */
@media (max-width: 768px) {
.floating-bubbles {
display: none;
}
}


@@ -0,0 +1,37 @@
[data-md-color-scheme="default"] {
--md-default-bg-color: #fff7ec;
--md-primary-fg-color: #000000;
--md-typeset-a-color: #0099ff;
--md-code-bg-color: #e7e7e7;
}
[data-md-color-scheme="slate"] {
--md-primary-fg-color: #000000;
--md-default-fg-color: #fff7ec;
--md-default-bg-color: #111111;
}
.clickable-banner {
color: #000000;
}
.md-main__inner.md-grid,
.md-grid {
max-width: 64rem;
}
@media screen and (min-width: 1220px) {
.md-main__inner.md-grid,
.md-grid {
max-width: 64rem;
}
}
.md-typeset h1,
.md-typeset h2,
.md-typeset h3 {
font-weight: 400;
color: var(
--md-primary-fg-color-dark
); /* this actually works for both light and dark themes */
}

Binary file not shown.



@@ -0,0 +1,11 @@
.dark-mode-only {
display: none !important;
}
[data-md-color-scheme="slate"] .dark-mode-only {
display: inline-block !important;
}
[data-md-color-scheme="slate"] .light-mode-only {
display: none !important;
}

Binary file not shown.


Binary file not shown.



@@ -0,0 +1,11 @@
// Source: https://github.com/srl-labs/learn-srlinux/pull/133/files
// this script is used to remove extra leading space when annotating shell code blocks ending with `\`
// character. See https://github.com/squidfunk/mkdocs-material/issues/3846 for more info.
document$.subscribe(() => {
const tags = document.querySelectorAll("code .se");
tags.forEach((tag) => {
if (tag.innerText.startsWith("\\")) {
tag.innerText = "\\";
}
});
});

File diff suppressed because one or more lines are too long


File diff suppressed because one or more lines are too long



@@ -0,0 +1,16 @@
/* From https://mkdocstrings.github.io/python/usage/customization/#symbol-types */
[data-md-color-scheme="default"] {
--doc-symbol-parameter-fg-color: #df50af;
--doc-symbol-attribute-fg-color: #0079ff;
--doc-symbol-function-fg-color: #00dfa2;
--doc-symbol-method-fg-color: #00dfa2;
--doc-symbol-class-fg-color: #d1b619;
--doc-symbol-module-fg-color: #ff0060;
--doc-symbol-parameter-bg-color: #df50af1a;
--doc-symbol-attribute-bg-color: #0079ff1a;
--doc-symbol-function-bg-color: #00dfa21a;
--doc-symbol-method-bg-color: #00dfa21a;
--doc-symbol-class-bg-color: #d1b6191a;
--doc-symbol-module-bg-color: #ff00601a;
}


@@ -0,0 +1,43 @@
/* Navigation cards styling */
.nav-card-link {
text-decoration: none;
color: inherit;
display: block;
}
.nav-card {
padding: 1rem;
border: 1px solid var(--md-default-fg-color--lighter);
border-radius: 0.5rem;
transition:
box-shadow 0.2s,
transform 0.2s;
}
.nav-card:hover {
box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1);
transform: translateY(-2px);
}
.nav-card-header {
display: flex;
align-items: center;
margin-bottom: 0.5rem;
}
.nav-card-icon {
margin-right: 0.5rem;
font-size: 1.2rem;
color: var(--md-default-fg-color);
}
.nav-card-description {
margin: 0;
color: var(--md-default-fg-color--light);
}
.nav-card-title {
margin: 0;
font-weight: bold;
color: var(--md-default-fg-color);
}

Binary file not shown.


Binary file not shown.



@@ -0,0 +1 @@
../../assets


@@ -0,0 +1,10 @@
<svg width="658" height="353" viewBox="0 0 658 353" fill="none" xmlns="http://www.w3.org/2000/svg">
<path d="M55.1537 318.026C51.7794 303.904 48.0221 289.861 45.1064 275.646C37.9205 240.61 31.0843 205.503 24.1762 170.411C16.1041 129.406 7.63666 88.4686 0.481801 47.3038C-0.897936 39.3656 2.87274 30.6123 5.58506 21.6882C15.1668 20.5639 23.6075 19.2929 31.9346 19.2829C134.817 19.1586 237.7 19.2873 340.582 19.0712C350.484 19.0504 358.728 21.8786 366.073 28.1404C371.661 32.9035 377.615 36.9801 383.547 42.386C398.77 126.163 410.072 211.032 431.491 293.541C431.423 316.71 423.039 325.081 400.592 325.171C398.492 325.18 396.393 325.16 394.294 325.161C321.603 325.174 248.912 325.19 176.222 325.201C144.941 325.206 113.66 325.123 82.3807 325.257C75.0251 325.288 68.1349 325.05 62.4362 319.241C61.0433 317.821 57.6351 318.378 55.1537 318.026Z" fill="#A8A3A7"/>
<path d="M103.675 1.07191C157.678 1.0985 211.024 0.958722 266.558 0.990249C269.471 0.965039 272.376 1.10224 273.759 1.07198C304.545 1.11839 334.851 0.711146 365.626 1.33516C382.219 1.67169 396.154 10.3148 403.346 26.2693C396.468 32.4468 389.815 37.0288 383.547 42.386C377.961 37.6212 371.661 32.9035 366.073 28.1404C358.728 21.8786 350.484 19.0504 340.582 19.0712C237.7 19.2873 134.817 19.1586 31.9346 19.2829C23.6075 19.2929 14.6875 21.0059 5.58506 21.6882C11.2591 6.63185 25.4035 2.28426 39.589 1.4272C56.7915 0.387894 74.8957 1.08602 93.7812 1.07179C98.5512 1.07262 100.528 1.10091 103.675 1.07191Z" fill="#BEB2BD"/>
<path d="M515.829 280.943C514.796 283.698 513.116 285.976 510.432 288.96C492.127 288.58 474.647 288.974 457.176 288.064C457.143 285.846 457.427 281.109 457.335 278.243C476.681 278.147 495.607 280.091 515.829 280.943Z" fill="#A8A3A7"/>
<path fill-rule="evenodd" clip-rule="evenodd" d="M403.346 26.2693C396.46 31.8133 389.47 36.3877 383.547 42.386C398.682 125.68 408.684 211.738 431.24 293.46C430.514 316.575 423.037 325.081 400.59 325.171C399.364 325.176 398.138 325.172 396.912 325.167C396.039 325.164 395.165 325.16 394.291 325.161L317.505 325.175C270.41 325.185 223.315 325.194 176.219 325.201C167.217 325.202 158.216 325.196 149.214 325.19C126.935 325.176 104.657 325.162 82.3785 325.257C75.0229 325.288 68.1327 325.05 62.434 319.241C60.6676 317.44 57.3192 317.744 55.1537 318.026C62.7049 339.915 77.4495 351.235 99.7865 351.436C119.364 351.612 138.942 351.716 158.52 351.82C166.961 351.865 175.401 351.91 183.842 351.96C183.842 351.96 201.052 352.088 220.341 352.042C290.845 353.207 361.493 352.261 432.003 352.023C435.82 352.011 457.323 350.123 457.176 350.487C478.346 353.369 498.075 348.707 517.677 344.076C526.575 341.973 535.446 339.877 544.414 338.495C560.582 336.004 576.629 331.994 592.64 328.49C598.173 327.278 603.706 326.067 609.245 324.898C609.245 324.898 632.023 320.412 639.66 318.175C651.795 314.62 656.121 304.062 657.117 292.56C654.52 289.364 651.061 286.39 648.118 283.561C617.827 283.096 587.294 282.142 556.93 281.869C542.997 281.671 529.108 281.184 515.829 280.943C513.978 283.397 512.273 285.827 510.432 288.96C514.659 290.812 519.076 292.14 523.328 293.919C537.731 295.135 554.881 296.283 570.136 296.957C570.143 296.985 570.149 297.013 570.155 297.041L583.324 299.759C580.268 300.502 577.878 301.566 575.237 302.232C569.882 303.581 565.204 304.76 560.483 305.727C556.374 306.568 552.261 307.392 548.141 308.217C536.974 310.453 525.379 312.42 514.032 315.057C514.032 315.057 518.968 322.188 522.131 326.756C522.131 326.756 517.373 335.735 512.237 336.737C500.538 333.137 500.533 333.055 501.433 318.657C490.404 318.189 477.133 318.47 464.537 318.657C461.837 315.957 460.037 311.458 459.138 308.758C458.124 301.414 457.797 295.713 457.176 288.064C457.151 285.037 457.836 281.707 457.335 278.243C453.818 
262.723 449.511 249.221 446.648 234.91C441.872 211.044 437.124 187.172 432.375 163.301C424.852 125.477 417.328 87.6528 409.697 49.85C408.066 41.7662 406.269 34.0301 403.346 26.2693Z" fill="#8B7887"/>
<path d="M583.324 299.759C577.025 303.359 567.877 304.615 560.827 306.058C545.581 309.18 530.271 311.333 514.032 315.057C509.379 316.521 505.122 317.939 501.433 318.657C489.365 318.895 477.28 318.731 464.537 318.657C461.837 315.957 460.037 312.358 459.138 308.758C480.584 303.391 502.186 298.639 523.328 293.919C541.327 294.665 566.589 291.391 583.324 299.759Z" fill="#BEB2BD"/>
<path d="M523.328 293.919C502.341 299.297 480.968 303.896 459.138 308.758C457.82 302.061 457.797 295.713 457.176 288.064C474.655 288.165 492.118 287.902 510.432 288.96C515.363 290.226 519.085 292.144 523.328 293.919Z" fill="#542F3F"/>
<path d="M501.433 318.657C501.433 318.657 509.637 315.888 514.032 315.057C522.131 315.057 522.129 323.707 522.131 326.756C522.136 335.837 515.042 336.19 512.237 336.737C500.538 337.637 496.74 325.947 501.433 318.657Z" fill="#542F3F"/>
<path d="M265.323 169.3C265.037 168.535 264.525 167.877 263.855 167.411C263.184 166.946 262.388 166.696 261.572 166.696H213.277L241.134 112.534C241.449 111.924 241.602 111.243 241.578 110.557C241.554 109.871 241.353 109.202 240.995 108.616C240.638 108.03 240.135 107.546 239.536 107.211C238.936 106.876 238.261 106.7 237.574 106.702C236.615 106.703 235.688 107.048 234.962 107.674L146.972 183.666C146.354 184.199 145.914 184.909 145.711 185.699C145.508 186.489 145.552 187.323 145.836 188.087C146.12 188.852 146.632 189.512 147.301 189.977C147.971 190.443 148.768 190.693 149.583 190.693H197.878L170.021 244.856C169.706 245.466 169.553 246.147 169.577 246.833C169.601 247.519 169.802 248.187 170.16 248.773C170.517 249.359 171.02 249.844 171.619 250.179C172.219 250.514 172.894 250.689 173.581 250.687C174.54 250.687 175.467 250.342 176.193 249.715L264.183 173.723C264.802 173.19 265.243 172.481 265.447 171.69C265.651 170.899 265.608 170.065 265.323 169.3Z" fill="#FFBD5D"/>
</svg>


File diff suppressed because one or more lines are too long



@@ -0,0 +1,184 @@
<?xml version="1.0" encoding="utf-8"?>
<!-- Generator: Adobe Illustrator 28.1.0, SVG Export Plug-In . SVG Version: 6.00 Build 0) -->
<svg version="1.1" id="Layer_1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" x="0px" y="0px"
viewBox="0 0 429.4 442.2" style="enable-background:new 0 0 429.4 442.2;" xml:space="preserve">
<style type="text/css">
.st0{fill-rule:evenodd;clip-rule:evenodd;}
.st1{fill:#FFF7EC;}
.st2{fill-rule:evenodd;clip-rule:evenodd;fill:#FFF7EC;}
</style>
<path d="M65.4,305c-5.4,4.4-6.5,5.3-11,3.5c-3.7-1.5-10.7-3.8-10.1-4.3c4.9-4.4,7.4-8.8,13.5-6.4c4.9,1.9,6.8,3.4,6.8,3.4
S68.3,302.4,65.4,305z"/>
<path d="M116.4,367.5c-18-7.2-35.6-14.3-53.6-21.5c5.6-7.1,9.5-8.5,16.1-5.9c22.8,8.9,45.5,17.9,68.3,26.9c1.5,0.6,3.2,1.1,3.8,2.9
c-4.3,5.9-9.4,7.5-15.7,5.1C129.1,372.6,122.9,370.1,116.4,367.5z"/>
<path d="M253.6,380.7c3.5-2.9,6.9-3,10.5-1.2c5.3,2.5,11.3,3.7,16,7c0.2,1.7-0.5,2.4-1.4,2.9c-7.6,4.5-15.2,8.9-22.8,13.4
c-1.7,1-3.5,1.4-5.3,0.8c-8-3.2-16.1-6.4-24.1-9.7c-0.9-0.4-1.8-0.9-1.9-2.1c5-5.8,8.6-7,15-4.8c2.7,1,4.7,0.7,6.8-1.1
C248.6,384.2,251,382.6,253.6,380.7z"/>
<path d="M259.5,361.1c11.1,4.3,21.8,8.5,32.4,12.8c1.7,0.7,3.9,1,4.2,3.1c0.3,2.5-2.2,3-3.8,4.1c-3.1,2.2-6.3,3-10.2,1.3
c-16.3-6.7-32.8-13-49.2-19.5c-1.7-0.7-3.5-1.2-4.4-3c3.8-5.6,8.8-7.3,14.7-5.1C248.6,356.9,253.9,359,259.5,361.1z"/>
<path d="M175,390.8c-9.1-3.9-18.3-6.9-26.9-11.3c4.2-6.3,9.4-8.2,15.5-5.8c14,5.4,27.9,10.9,41.8,16.5c1.8,0.7,4.1,1.3,4.2,3.7
c0.1,2.2-2.3,2.6-3.7,3.7c-3.6,2.9-7.4,3-11.6,1.1C188.1,395.9,181.7,393.5,175,390.8z"/>
<path d="M50.1,328.5c5.5,2.6,11.2,3.8,16,7.2c-5.6,6.9-9.2,7.9-16.2,5.2c-10.8-4.3-21.6-8.6-32.4-12.9c-1.5-0.6-3.5-1.1-3.8-2.9
c-0.5-2.3,1.9-2.8,3.2-3.9c3.4-2.9,6.8-3.3,11-1.5C35.1,322.9,42.5,325.6,50.1,328.5z"/>
<path d="M62,316.7c4.7,1.9,9,3.6,13.3,5.4c3.3,1.3,3.1,3,0.5,4.9c-4.2,3.2-8.2,4.6-13.7,2.1c-9.3-4.3-19-7.7-28.6-11.6
c-1.6-0.7-3.8-1-4.2-3.1c-0.4-2.3,2-2.7,3.3-3.9c3-2.7,6.3-3.1,10.1-1.4C48.9,311.7,55.3,314,62,316.7z"/>
<path d="M237.5,403.1c3,1.5,6.2,1.6,8.3,4c-2.4,4.1-8.5,7.8-12.3,6.5c-8.3-2.9-16.4-6.3-24.6-9.6c-0.5-0.2-0.8-0.9-1.2-1.3
c4.4-5.8,9.1-7.4,15.1-5.3C227.6,399.1,232.4,401.1,237.5,403.1z"/>
<path d="M212.7,355.1c-4.4-1.7-8.4-3.2-12.4-4.9c-1.3-0.5-3-1-3.1-2.8c-0.1-1.8,1.8-2.1,2.9-3c4.4-3.5,8.8-3.7,13.8-1.2
c4.3,2.1,8.9,3.5,13.4,5.3c1.3,0.5,2.9,1.1,3.1,2.9c0.2,1.9-1.6,2.2-2.7,3.2C223,358.8,218.1,358,212.7,355.1z"/>
<path d="M183.1,376.8c-6.2-2.7-12.5-4.1-17.9-7.8c4.7-6.5,9-7.8,15.6-5.3c4.3,1.7,8.7,3.3,13,5c1.7,0.7,3.5,1.2,4.4,2.9
C194.5,377.4,189.6,379.1,183.1,376.8z"/>
<path d="M179.1,341.4c-6.2-2.7-12.5-4.1-17.9-7.8c4.7-6.5,9-7.8,15.6-5.3c4.3,1.7,8.7,3.3,13,5c1.7,0.7,3.5,1.2,4.4,2.9
C190.6,341.9,185.7,343.7,179.1,341.4z"/>
<path d="M123,328.7c-5.4,4.4-10.7,4.2-16.4,1.1c-4.7-2.5-10.3-3.3-14.7-6.8c4.4-6.1,8.9-7.5,15.1-5.1c4.9,1.9,9.9,3.8,14.8,5.8
C124.2,324.7,125.8,326,123,328.7z"/>
<path d="M136.5,350.6c-5.4,4.4-10.7,4.2-16.4,1.1c-4.7-2.5-10.3-3.3-14.7-6.8c4.4-6.1,8.9-7.5,15.1-5.1c4.9,1.9,9.9,3.8,14.8,5.8
C137.7,346.6,139.3,347.9,136.5,350.6z"/>
<path d="M210.1,358.5c1.7,0.8,3.2,1.1,3.9,2.7c-3.5,5.9-8.4,7.6-14.5,5.3c-4.5-1.7-9-3.5-13.5-5.2c-1.7-0.7-3.6-1.1-4.2-3.1
c4-5.5,8.2-7,13.9-5C200.4,354.7,205.1,356.6,210.1,358.5z"/>
<path d="M177.3,345.2c1.7,0.8,3.2,1.1,3.9,2.7c-3.5,5.9-8.4,7.6-14.5,5.3c-4.5-1.7-9-3.5-13.5-5.2c-1.7-0.7-3.5-1.1-4.2-3.1
c4-5.5,8.2-7,13.9-5C167.7,341.4,172.3,343.3,177.3,345.2z"/>
<path d="M166.8,361.1c-5.5,4.9-11.1,6-17.5,2.4c-4.5-2.5-10-3.3-14.2-6.8c6.2-6.5,9.5-7.3,16.9-4.5c3.9,1.5,7.7,3,11.6,4.5
C165.3,357.5,167.6,358,166.8,361.1z"/>
<path d="M222.1,387.8c-3.1,2.3-6,2.3-9.2,0.9c-4.4-1.9-8.9-3.6-13.4-5.4c-1.5-0.6-3.1-1.1-3.8-2.8c4.3-5.9,8.8-7.4,14.9-5.1
c3.7,1.4,7.4,2.9,11.1,4.4c1.9,0.8,4.9,0.8,5.3,3.2C227.5,385.9,224.1,386.1,222.1,387.8z"/>
<path d="M89.1,327.4c6.7,2.9,13.4,4.6,19.4,8.3c-4.7,6.4-9.5,7.9-16,5.1c-5.5-2.4-11.7-3.5-17.1-8.2
C79.9,329.6,82.9,325.2,89.1,327.4z"/>
<path d="M75.5,305.6c0.2,6.6,5.2,10.9,10.2,10.2c3.1-0.4,5.8-1.9,7.9-3.5c2.3,0.9,3.9,2.2,1.1,4.9c-5.4,4.4-10.7,4.2-16.4,1.1
c-1.9-1-3.9-1.8-6-2.5c-3-1.1-6.1-2.2-8.7-4.3C67.3,306.6,70.9,304.8,75.5,305.6C75.5,305.6,75.5,305.6,75.5,305.6z"/>
<path d="M188.9,41.4c-0.1,37,0,114.5,0,114.5s0,6.3,0,15.4l23.7-6.3c0,0,15.9-2.6,23.7-8.6c1.4-1.1,2.2-2.2,3.1-3.4
c1.1-1.5,2.2-3,4.3-3.9c3.8-1.7,8.2-2.9,12.6-4.1c4.4-1.2,8.7-2.5,12.4-4.1c6.5-2.9,12.6-5.6,19.5-6.6c6.7-1,14.6,5.3,21.7,12.3
c5.2,5.2,11,17.5,13.2,30.3c2.1,11.7,2.5,19.3,0,29.8c-1.2,5.1-2.7,8.3-6.9,11.4c-5,3.8-10.4,4.9-16.4,6.2
c-3.6,0.8-7.5,1.6-11.5,3.1c-4.8,1.7-12.2,4.5-12.2,4.5s-3.1,16.4-6.8,26.2c-1,2.7-1.8,5.3-2.5,7.9c-0.3,1-0.5,1.9-0.8,2.8
c3.5,1.4,5.7,2.3,5.7,2.3s79.4,30.1,119.1,45.1c0.9,0.4,1.9,0.7,2.8,1c0.4,0.1,0.8,0.3,1.2,0.4c1.9-1.5,1.8-3.4,1.7-5.1
c0-0.4,0-0.8,0-1.1c0-21.3,0-42.6,0-64c0,0,0-0.1,0-0.1c0-43.8,0-87.6,0.1-131.4c0-4.5-1.8-6.4-5.8-7.9c-49.1-18-98.1-36-147.1-54.1
c0,0-0.1,0-0.1,0c-13.1-4.8-26.2-9.7-39.3-14.5c-3.2-1.2-6.4-2.3-9.5-3.5l0,0c-1.5-0.5-2.9-1.1-4.4-1.6c-1.7,2-1.5,3.9-1.4,5.8
C188.8,40.5,188.9,41,188.9,41.4z"/>
<path d="M103.5,286.8c-0.7,2.4-1.3,4.9-1.7,7.2c5.2,2.1,11.1,4.4,17.2,6.8c-0.1-0.7-0.1-1.4-0.2-2.2c-0.2-2.1-0.3-4.1-0.3-6
C113.1,290.5,108,288.5,103.5,286.8z"/>
<path d="M148.6,304.3c0.1,1.1,0.2,2.3,0.3,3.4c0.2,1.7,0.3,3.3,0.4,5c14.5,5.7,25.5,10,25.5,10S252,353,290.6,368.1
c5.4,2.1,10.8,4.3,16.3,6.4c3.3,1.3,3.8,3.1,0.7,5.2c-1.1,0.8-2.2,1.5-3.4,2.1c-5.2,3.1-10.5,6.1-15.7,9.1c0,0,0,0,0,0l0,0
c-12.2,7.1-24.4,14.1-36.3,21.7c-10.4,6.6-20.1,6.8-31.2,2.4c-49.9-20-99.9-39.9-149.9-59.8c-17.5-7-35-13.9-52.4-20.9
c-3.1-1.2-12.5-6.3-8.7-9.8c6.4-6,32.2-24.7,49.3-36.9c-0.4-1.1-0.7-2.4-0.8-3.8c-22.9,15-52.7,37-55.5,40.7
c-4.7,6.2-2.7,15.2-1.7,20c0,0.1,0,0.2,0.1,0.3c1.6,7.4,3.3,8.1,10.6,10.9c1.2,0.4,2.5,0.9,4,1.5c39.9,15.9,79.8,31.9,119.8,47.9
c28.8,11.5,57.6,23.1,86.6,34.6c9.9,2.9,19.4,3.9,28.4-1.3c13.3-7.7,26.6-15.6,39.8-23.4c3.7-2.2,7.3-4.3,11-6.5
c15.3-9,15.3-9.2,15.1-20.3c0-1.8-0.1-3.9,0-6.4c0.1-10.8-1.1-12.3-11-16.2c-17.4-6.8-34.9-13.6-52.3-20.5
c-14.8-5.8-29.7-11.6-44.5-17.4c-1.6,0.3-3.3,0.3-4.8-0.2c-2.7-0.9-5.2-2.4-7.2-4.5c-0.1-0.1-0.3-0.1-0.4-0.2
c-0.7-0.3-2.2-0.9-4.4-1.7C183.8,318.1,166.8,311.5,148.6,304.3z"/>
<path d="M126.1,327.8c-2.2,0.9-4.3,2.7-6.3,5.5c2.6,2.1,5.7,3.2,8.7,4.3c2,0.7,4.1,1.5,6,2.5c5.6,3,10.9,3.3,16.4-1.1
c2.8-2.7,1.1-4-1.1-4.9c-1.6-0.7-3.3-1.3-4.9-1.9c-1.7,2.7-4,4.5-6.7,4.4C133.3,336.2,129.3,332.8,126.1,327.8z"/>
<path d="M119.4,306.7c-4.1-0.4-7.4,1.6-10.8,6.2c2.6,2.1,5.7,3.2,8.7,4.3c1.5,0.5,2.9,1.1,4.3,1.7
C120.2,314.6,119.8,311,119.4,306.7z"/>
<path d="M96.7,309.6c1,0.4,1.9,0.7,2.7,1c4.5,1.8,5.6,1,11-3.5c2.8-2.7-0.8-3.9-0.8-3.9s-1.9-1.5-6.8-3.4c-0.7-0.3-1.3-0.4-1.9-0.6
c-0.3,2.1-0.7,4.2-1.1,6.2C99.8,305.6,98.7,307.5,96.7,309.6z"/>
<path d="M74.6,300.3c-0.1-0.8-0.2-1.5-0.2-2.2v0c0-1,0-2,0-3c-0.9-1-1.9-0.9-2.9-0.8c-0.3,0-0.6,0.1-0.9,0.1c-3.1,0-5.7-0.7-7.7-2
c-0.5,0.6-1,1.2-1.5,1.9c2.6,2.1,5.7,3.2,8.7,4.3C71.7,299,73.2,299.6,74.6,300.3z"/>
<path d="M184.9,1.5c-12-4.5-21.2,1.5-21,14.9l-0.4,161.2l14.7-3.5l4.2-1.1c0.8-50.3,1-100.5,1-150.9c37.9,13.9,75.6,28.2,113.4,42.4
c0,0,0,0,0,0c32,12.1,64,24.2,96.1,36c9.7,3.6,11,5.2,11,15.3c0,20.7,0,41.3,0,62c0,47.2,0,94.3,0,141.5c0,5.2-4.3,6.6-7.4,5.5
c-45.6-15.4-89.7-31.4-133.8-48c-1,1.9-2.3,3.8-4,5.6c-3.4,3.6-6.4,6-10,8.8c0,0,0,0,0,0c3.2,1.2,6.4,2.3,9.6,3.5
c47.8,17.1,95,34,132.6,49c9.4,3.7,17.9,4,27,0c7.8-3.4,11.5-8.6,11.5-17.2c0-7.2,0-14.4,0-21.6c0-3.6,0-7.2,0-10.8
c0-19.6,0-39.1,0-58.7c0-42.9,0-85.8,0-128.7c0-11.4-3.3-16.3-14-20.1c-3.4-1.2-6.7-2.5-10.1-3.7c-19.6-7.2-39.3-14.3-58.9-21.5
c-10.3-3.8-20.5-7.6-30.8-11.4C272,34,228.4,17.9,184.9,1.5z"/>
<path d="M208.2,276.8c0.5-0.6,1.1-1.2,1.7-1.8c4.5-5,8.9-9.9,10.4-14.2c-7.7-3-15.4-5.9-23.1-8.9c-1,0.2-2,0.4-3.1,0.8
c-1.1,0.3-2.1,0.6-3,0.9h0c-5.7,1.7-10.2,3-15.7,6.5c-1.7,1.1-3.2,2.1-4.5,3.2C183,267.7,195.5,272.3,208.2,276.8z"/>
<path d="M146.7,328.3c0.3,0.1,0.7,0.3,1,0.4l0,0c1.3,0.5,2.6,1,3.6,1.4c4.5,1.8,5.6,1,11-3.5c2.8-2.7-0.8-3.9-0.8-3.9
s-1.9-1.5-6.8-3.4c-2.3-0.9-4.1-0.9-5.7-0.2c-0.1,1-0.3,2-0.5,3.1C148.1,324.1,147.5,326.3,146.7,328.3z"/>
<path d="M305,203.4c-3.1,2-6.1,2.5-8.4-0.2c-2.2-2.6-1.9-5.6-0.2-8.4c2.1-3.5,6.5-4.5,9.5-2.4C309.1,194.8,308.9,198.6,305,203.4z"
/>
<path class="st0" d="M85.7,315.8c-5,0.7-10-3.6-10.2-10.2c0-1.1-0.3-2.3-0.5-3.6c-0.3-1.4-0.5-2.8-0.5-4v0c0-1,0-2,0-3
c-1.2-1.4-2.6-0.7-3.8-0.7c-8.7,0-13.4-5.6-11.9-14.3c2.1-12,5.8-24.1,10.9-35.2l0,0c0.7-1.5,1.4-3,0.8-5c-1.5,0.3-3,0.7-4.4,1.2
c-2.5,0.8-5,1.5-7.7,1.5c-6.6,0-11.1-2.3-13.2-7.1c-2.1-4.9-0.7-10.3,4.3-14.8c9.7-8.7,16.7-18.1,30.7-23.9
c5.5-2.3,8.8-2.9,14.6-4.4c11.4-3,29.7-5.3,29.7-5.3l53.7-12.8l34.3-9.1c0,0,15.9-2.6,23.7-8.6c3.2-2.5,3.7-5.7,7.4-7.3
c7.6-3.3,17.7-5,25-8.2c6.5-2.9,12.6-5.6,19.5-6.6c6.7-1,14.6,5.3,21.7,12.3c5.2,5.2,11,17.5,13.2,30.3c2.1,11.7,2.5,19.3,0,29.8
c-1.2,5.1-2.7,8.3-6.9,11.4c-8.1,6-17.1,5.4-28,9.3c-4.8,1.7-12.2,4.5-12.2,4.5s-3.1,16.4-6.8,26.2c-3.4,8.7-3.5,17.1-10.4,24.5
c-4.2,4.5-7.9,7.1-12.7,11c-6,4.9-12.9,11.3-17.1,15.7c-3.4,3.5-5.6,5.8-9.3,10.8c-4.6,6.1-10.7,9.2-15.8,7.6
c-6.2-2-11.5-7.5-11.7-15.4c-0.1-6,4.9-16.8,7-22.5c3.6-9.9,20.6-22,21.7-30.9c0.6-4.6-6.5-5.2-10.3-6.2c-6.2-1.7-11.3-1.5-16.4,0
c-7.2,2.2-12.3,3.4-18.8,7.4c-5.6,3.5-8.3,6.7-12.8,9.8c-7.3,5.1-8.6,4.6-11.6,10.2c-1.8,3.4-2,10-2.4,17.1c-0.6,9.5,2,15.5,0,25
c-1.3,6.2-4.8,14.6-10.5,14.3c-6.8-0.4-12-7.2-15.5-15.6c-3.2-7.6-2.8-13.1-3.7-22.2c-1-10,0.5-16.8,2.9-26.5
c1.5-5.9,3.7-10.2,6.6-14.1c2.4-3.3,6-6.3,6-6.3s-9.6,6-12.5,8.3c-4.4,3.5-8.3,9-9.2,10.9c-10.8,11.8-10.5,24.1-12.7,34.7
C99.8,305.6,94.4,314.6,85.7,315.8z M82.7,261.8L82.7,261.8c2.9-6.2,5.8-12.4,8.5-18.7c3.3-7.6,8.5-12.8,16.1-16.2
c14.8-6.5,29.6-12.9,44.7-18.6c1.9-0.7,3.8-1.3,5.7-1.6c2.1-0.3,4.1,0.2,4.8,2.6c0.6,2.2-0.8,3.5-2.4,4.5c-1.1,0.7-2.4,1.1-3.6,1.7
c0,0-8.1,3.5-12.1,5.3c0,0,0,0,0,0l0,0c-10.6,4.6-21.2,9.2-31.9,13.7c-5.8,2.5-10,6.4-12.7,12.1c-0.9,2-1.9,4.1-2.8,6.1l0,0l0,0
c-2.4,5.4-4.9,10.8-7.7,15.9c-5.9,10.6-6,23.1-6.4,34.9c-0.1,1.9-0.1,3.9,2,4.6c2,0.7,3.5-0.7,4.9-1.8c2.6-2.2,3.9-5.2,4.1-8.3
c0.5-12.1,6-21.9,13.4-30.9c4-4.9,6-9.9,11.4-14.5c4.9-4.1,13-9.6,13-9.6s11.3-4.5,18.1-7.7c6.7-3.1,10.3-5.9,15.2-7.3
c2.3-0.6,4.4-0.9,6,0c1,0.5,1.4,1.2,1.8,2.2c0.6,1.8-0.8,3.8-2.6,5c-4.1,2.7-8.5,3.6-13.7,7.7c-6,4.8-15.6,13.7-15.6,13.7
s-10.4,11-11.4,17.7c-1.4,10.1-3.1,12.3-3.1,24.2c0,3.3,0.4,26.8,11.5,29.4c5.1,1.2,3.3-11.8,3.3-11.8l-2-18.3
c0,0,0.4-13.9,4.5-21.5c2.4-4.4,5.5-6.4,8.3-8.8c1.4-1.2,11.1-8.5,18.2-13.9c7.9-6,14.6-6.8,23.9-9.1c10.6-2.7,18.9-3.4,27.6,3.1
c4.1,3.1,8.4,5,8.4,10.2c0,7.8-8.2,15.1-12.7,19.8c-5.6,5.8-11.5,12.7-14.5,19.4c-4.4,9.6-4.3,18.3,2.2,20.9c2.9,1.1,4.2-1.1,5.4-3
c10.9-17.7,23.9-21.7,35.5-32.5c8.5-7.9,9.6-16.8,15.6-28.6c5.8-11.5,1.2-24.4,7.5-36.4c0.9-1.8,1.2-3.7,1.3-5.8
c0.5-18.5-7.5-32.7-21.7-43.8c-2.7-2.1-5.4-2.5-8.8-1.5c-12.3,3.7-24.6,7.1-37,10.4c-32.8,8.8-65.8,16.9-99.2,22.6
c-10.9,1.9-20.5,4.8-28.9,11.9c-4.2,3.6-8.4,7.3-13,10.3c-3.7,2.5-9.9,6.7-8.3,10.2c4,8.6,24.4-6.7,26.8-11.1
c3.3-6.1,8.6-9.1,14.8-10.9c13.1-3.8,26.2-7.6,39.3-11.3c3.2-0.9,6.4-1.7,9.7-2.6c0,0,0,0,0,0c1.1-0.3,2.2-0.6,3.3-0.9
c2.5-0.6,4.4,0.2,5.2,2.6c0.7,2.3-0.8,3.7-2.8,4.6c-4.9,2.1-10.5,3.3-15.6,4.8c0,0,0,0,0,0c0,0,0,0,0,0c-12,3.7-23.9,7.3-35.9,10.9
c-5.1,1.5-8.8,4.4-11.1,9.3c-8.1,17-15.4,34.3-19.8,52.7c-0.5,2-1,4.1,1.6,5.1c2.5,0.9,4.2-1.1,5.1-3c1.1-2.4,2.1-5,3.2-7.7l0,0
c0.6-1.5,1.2-3,1.8-4.6C80.2,267.2,81.5,264.5,82.7,261.8z M279.6,211.5c0.2,1.7,0,3.3-0.1,5.1c-0.3,2.4-0.5,4.9,0.3,7.6
c2.8-0.9,5.6-1.9,8.3-2.8l0,0h0l0,0l0,0l0,0l0,0c6.7-2.3,13.2-4.5,19.9-6.4c4.3-1.2,5.8-3.7,6.4-7.7c1.3-9.3,1.8-18.5,0-27.8
c-2.9-14.5-9.2-26.9-22.1-35.3c-2.4-1.6-4.7-1.9-7.5-1c-6.1,2-12.3,3.9-18.4,5.9l0,0c-1.6,0.5-3.3,1-4.9,1.6c-3,1-6.1,2-9.2,2.9h0
c-1.5,0.4-3,0.9-3.4,2.8c0.8,1.4,2.1,2.2,3.4,2.9C270.4,169.5,279.5,191.4,279.6,211.5z"/>
<path class="st1" d="M82.8,261.8L82.8,261.8c-1.3,2.7-2.5,5.4-3.7,8.1c-0.7,1.6-1.2,3.1-1.8,4.6l0,0c-1,2.7-2,5.2-3.2,7.7
c-0.9,1.9-2.6,3.9-5.1,3c-2.6-0.9-2.1-3.1-1.6-5.1c4.5-18.4,11.7-35.7,19.8-52.7c2.3-4.8,6-7.7,11.1-9.3c12-3.6,24-7.2,35.9-10.9
l0,0l0,0c5.1-1.6,10.7-2.7,15.6-4.8c2-0.9,3.6-2.3,2.8-4.6c-0.8-2.4-2.7-3.2-5.2-2.6c-1.1,0.3-2.2,0.6-3.3,0.9l0,0
c-3.2,0.9-6.5,1.7-9.7,2.6c-13.1,3.7-26.2,7.5-39.3,11.3c-6.2,1.8-11.6,4.8-14.8,10.9c-2.4,4.4-22.8,19.7-26.8,11.1
c-1.6-3.5,4.5-7.6,8.3-10.2c4.6-3.1,8.8-6.8,13-10.3c8.4-7.1,18.1-10,28.9-11.9c33.5-5.7,66.4-13.9,99.2-22.6
c12.4-3.3,24.7-6.8,37-10.4c3.4-1,6.1-0.6,8.8,1.5c14.3,11.1,22.3,25.4,21.7,43.8c-0.1,2-0.4,4-1.3,5.8c-6.2,12-1.7,24.9-7.5,36.4
c-6,11.8-7.1,20.7-15.6,28.6c-11.6,10.8-24.6,14.8-35.5,32.5c-1.2,1.9-2.6,4.1-5.4,3c-6.5-2.6-6.5-11.3-2.2-20.9
c3-6.7,9-13.6,14.5-19.4c4.5-4.7,12.7-11.9,12.7-19.8c0-5.2-4.3-7.1-8.4-10.2c-8.7-6.5-17-5.8-27.6-3.1c-9.3,2.4-15.9,3.1-23.9,9.1
c-7.1,5.4-16.8,12.7-18.2,13.9c-2.8,2.4-5.9,4.4-8.3,8.8c-4.1,7.5-4.5,21.5-4.5,21.5l2,18.3c0,0,1.8,13-3.3,11.8
c-11.1-2.6-11.5-26.2-11.5-29.4c0-11.9,1.7-14.2,3.1-24.2c1-6.7,11.4-17.7,11.4-17.7s9.7-8.9,15.6-13.7c5.2-4.1,9.5-5.1,13.7-7.7
c1.9-1.2,3.2-3.2,2.6-5c-0.4-1.1-0.8-1.7-1.8-2.2c-1.7-0.9-3.8-0.6-6,0c-4.8,1.3-8.5,4.1-15.2,7.3c-6.8,3.2-18.1,7.7-18.1,7.7
s-8.1,5.5-13,9.6c-5.4,4.5-7.4,9.6-11.4,14.5c-7.5,9-12.9,18.8-13.4,30.9c-0.1,3.1-1.4,6.2-4.1,8.3c-1.4,1.2-2.8,2.5-4.9,1.8
c-2.1-0.7-2.1-2.8-2-4.6c0.3-11.8,0.5-24.3,6.4-34.9c2.8-5.2,5.3-10.6,7.7-15.9l0,0l0,0c0.9-2.1,1.9-4.1,2.8-6.1
c2.7-5.7,6.9-9.6,12.7-12.1c10.7-4.5,21.3-9.1,31.9-13.7l0,0l0,0c4-1.8,12.1-5.3,12.1-5.3c1.2-0.5,2.5-0.9,3.6-1.7
c1.6-1,3-2.3,2.4-4.5c-0.7-2.4-2.7-2.9-4.8-2.6c-2,0.3-3.9,0.9-5.7,1.6c-15.1,5.7-29.9,12.1-44.7,18.6c-7.6,3.3-12.8,8.6-16.1,16.2
C88.5,249.3,85.6,255.5,82.8,261.8z"/>
<path class="st1" d="M188.8,155.9c0,0,0-77.5,0-114.5c0-0.4,0-0.9-0.1-1.4c-0.1-1.8-0.2-3.8,1.4-5.8c1.4,0.5,2.9,1,4.4,1.6l0,0
c3.2,1.1,6.3,2.3,9.5,3.5c13.1,4.8,26.2,9.7,39.3,14.5l0.1,0c49,18,98.1,36.1,147.1,54.1c4,1.5,5.8,3.4,5.8,7.9
c-0.1,43.8-0.1,87.6-0.1,131.4v0.1c0,21.3,0,42.6,0,64c0,0.4,0,0.7,0,1.1c0.1,1.8,0.2,3.6-1.7,5.1c-0.4-0.1-0.8-0.3-1.2-0.4
c-0.9-0.3-1.9-0.6-2.8-1c-39.7-15-119.1-45.1-119.1-45.1s-2.1-0.8-5.7-2.3c-0.8,2.8-1.7,5.5-3.1,8.1c44,16.6,88.2,32.6,133.8,48
c3.1,1,7.4-0.3,7.4-5.5c0-47.2,0-94.3,0-141.5c0-20.7,0-41.3,0-62c0-10.1-1.3-11.7-11-15.3c-32.1-11.9-64.1-24-96.1-36l0,0
C259.1,50.3,221.3,36,183.4,22.1c0,50.3-0.2,100.6-1,150.9l6.4-1.7C188.8,162.2,188.8,155.9,188.8,155.9z"/>
<path class="st2" d="M93.6,312.4c2.3,0.9,3.9,2.2,1.1,4.9c-5.4,4.4-10.7,4.2-16.4,1.1c-1.9-1-3.9-1.8-6-2.5c-3-1.1-6.1-2.2-8.7-4.3
c3.6-5,7.2-6.8,11.8-6l0,0c0-1.1-0.3-2.3-0.5-3.6c-0.1-0.6-0.2-1.2-0.3-1.8c-1.5-0.7-3-1.2-4.5-1.8c-3-1.1-6.1-2.2-8.7-4.3
c0.5-0.7,1-1.3,1.5-1.9c-1.7-1.1-3-2.7-3.7-4.6c-17,12.1-42.9,30.9-49.3,36.9c-3.7,3.5,5.6,8.6,8.7,9.8c17.5,7,35,13.9,52.4,20.9
c50,19.9,99.9,39.8,149.9,59.8c11,4.4,20.8,4.2,31.2-2.4c11.9-7.5,24.1-14.6,36.3-21.7l0,0l0,0c5.2-3,10.5-6.1,15.7-9.1
c1.1-0.7,2.3-1.4,3.4-2.1c3.1-2.2,2.6-3.9-0.7-5.2c-5.4-2.1-10.8-4.3-16.3-6.4C252,353,174.8,322.8,174.8,322.8s-11-4.3-25.5-10
c0.1,2,0.1,4.1-0.2,6.4c1.6-0.6,3.4-0.7,5.7,0.2c4.9,1.9,6.8,3.4,6.8,3.4s3.6,1.2,0.8,3.9c-5.4,4.4-6.5,5.3-11,3.5
c-1-0.4-2.3-0.9-3.6-1.4l0,0c-0.3-0.1-0.7-0.3-1-0.4c-0.6,1.4-1.2,2.6-1.9,3.8c1.6,0.6,3.3,1.3,4.9,1.9c2.3,0.9,3.9,2.2,1.1,4.9
c-5.4,4.4-10.7,4.2-16.4,1.1c-1.9-1-3.9-1.8-6-2.5c-3-1.1-6.1-2.2-8.7-4.3c2-2.8,4.1-4.6,6.3-5.5c-1.3-2.1-2.5-4.4-3.6-6.9
c-0.3-0.7-0.6-1.4-0.8-2.1c-1.4-0.6-2.9-1.2-4.3-1.7c-3-1.1-6.1-2.2-8.7-4.3c3.3-4.6,6.7-6.5,10.8-6.2c-0.1-1.8-0.3-3.8-0.4-5.9
c-6.1-2.4-12-4.7-17.2-6.8c-0.3,1.8-0.6,3.6-0.9,5.4c0.6,0.1,1.2,0.3,1.9,0.6c4.9,1.9,6.8,3.4,6.8,3.4s3.6,1.2,0.8,3.9
c-5.4,4.4-6.5,5.3-11,3.5c-0.8-0.3-1.7-0.7-2.7-1C95.8,310.5,94.8,311.5,93.6,312.4z M54.5,308.5c4.5,1.8,5.6,1,11-3.5
c2.8-2.7-0.8-3.9-0.8-3.9s-1.9-1.5-6.8-3.4c-6.2-2.4-8.6,2-13.5,6.4C43.7,304.7,50.8,307,54.5,308.5z M62.8,346
c18,7.2,35.6,14.3,53.6,21.5c6.5,2.6,12.7,5.1,18.9,7.5c6.3,2.4,11.4,0.8,15.7-5.1c-0.6-1.8-2.3-2.3-3.8-2.9
c-22.8-9-45.5-18-68.3-26.9C72.3,337.5,68.4,338.9,62.8,346z M264.1,379.5c-3.7-1.7-7-1.7-10.5,1.2c-2.6,1.9-5,3.4-7.1,5.3
c-2.1,1.8-4.2,2.1-6.8,1.1c-6.4-2.3-10-1-15,4.8c0.2,1.1,1,1.7,1.9,2.1c8,3.3,16,6.5,24.1,9.7c1.8,0.7,3.6,0.2,5.3-0.8
c7.6-4.5,15.2-8.9,22.8-13.4c0.9-0.5,1.6-1.3,1.4-2.9C275.4,383.2,269.5,382,264.1,379.5z M291.9,373.9
c-10.7-4.2-21.4-8.4-32.4-12.8c-5.7-2.2-10.9-4.2-16.2-6.2c-6-2.2-10.9-0.5-14.7,5.1c0.9,1.8,2.7,2.3,4.4,3
c16.4,6.5,32.9,12.8,49.2,19.5c4,1.6,7.1,0.9,10.2-1.3c1.6-1.1,4.1-1.7,3.8-4.1C295.9,374.9,293.6,374.5,291.9,373.9z M148.1,379.5
c8.6,4.5,17.8,7.4,26.9,11.3c6.7,2.7,13.1,5.1,19.3,7.8c4.2,1.8,8,1.8,11.6-1.1c1.4-1.1,3.8-1.5,3.7-3.7c-0.1-2.4-2.4-3-4.2-3.7
c-13.9-5.5-27.8-11.1-41.8-16.5C157.5,371.3,152.3,373.2,148.1,379.5z M66.1,335.8c-4.8-3.4-10.6-4.6-16-7.2c-7.6-3-15-5.6-22.1-8.8
c-4.2-1.9-7.6-1.4-11,1.5c-1.3,1.1-3.7,1.6-3.2,3.9c0.4,1.8,2.3,2.2,3.8,2.9c10.8,4.3,21.6,8.6,32.4,12.9
C56.9,343.7,60.5,342.6,66.1,335.8z M75.2,322.1c-4.3-1.7-8.6-3.5-13.3-5.4c-6.7-2.6-13.1-5-19.3-7.7c-3.8-1.6-7-1.3-10.1,1.4
c-1.3,1.2-3.7,1.6-3.3,3.9c0.4,2,2.6,2.4,4.2,3.1c9.5,3.9,19.3,7.3,28.6,11.6c5.5,2.5,9.5,1.1,13.7-2.1
C78.3,325,78.6,323.4,75.2,322.1z M245.8,407.1c-2.1-2.4-5.3-2.5-8.3-4c-5.2-2-9.9-4-14.8-5.7c-5.9-2.2-10.6-0.5-15.1,5.3
c0.4,0.4,0.7,1.1,1.2,1.3c8.2,3.3,16.3,6.7,24.6,9.6C237.3,414.9,243.4,411.3,245.8,407.1z M200.3,350.3c4,1.6,8,3.2,12.4,4.9
c5.4,2.9,10.3,3.6,15-0.5c1.1-0.9,2.9-1.3,2.7-3.2c-0.2-1.8-1.8-2.3-3.1-2.9c-4.4-1.8-9.1-3.2-13.4-5.3c-5-2.5-9.5-2.3-13.8,1.2
c-1.1,0.9-3,1.2-2.9,3C197.3,349.3,199,349.7,200.3,350.3z M165.1,369c5.4,3.6,11.8,5.1,17.9,7.8c6.6,2.3,11.4,0.6,15-5.2
c-0.9-1.7-2.7-2.3-4.4-2.9c-4.3-1.7-8.7-3.3-13-5C174.2,361.2,169.8,362.6,165.1,369z M161.2,333.6c5.4,3.6,11.8,5.1,17.9,7.8
c6.6,2.3,11.4,0.6,15-5.2c-0.9-1.7-2.7-2.3-4.4-2.9c-4.3-1.7-8.7-3.3-13-5C170.2,325.8,165.9,327.1,161.2,333.6z M106.7,329.8
c5.6,3,10.9,3.3,16.4-1.1c2.8-2.7,1.1-4-1.1-4.9c-4.9-2-9.9-3.9-14.8-5.8c-6.2-2.4-10.6-1-15.1,5.1
C96.4,326.5,102,327.3,106.7,329.8z M120.2,351.7c5.6,3,10.9,3.3,16.4-1.1c2.8-2.7,1.1-4-1.1-4.9c-4.9-2-9.9-3.9-14.8-5.8
c-6.2-2.4-10.6-1-15.1,5.1C109.9,348.4,115.5,349.2,120.2,351.7z M213.9,361.2c-0.7-1.6-2.2-1.9-3.9-2.7c-5-1.9-9.6-3.8-14.3-5.4
c-5.7-2-9.9-0.4-13.9,5c0.6,2.1,2.5,2.5,4.2,3.1c4.5,1.8,9,3.5,13.5,5.2C205.6,368.8,210.4,367.1,213.9,361.2z M181.2,347.9
c-0.7-1.6-2.2-1.9-3.9-2.7c-5-1.9-9.6-3.8-14.3-5.4c-5.7-2-9.9-0.4-13.9,5c0.6,2.1,2.5,2.5,4.2,3.1c4.5,1.8,9,3.5,13.5,5.2
C172.8,355.5,177.7,353.7,181.2,347.9z M149.3,363.5c6.4,3.6,12,2.5,17.5-2.4c0.7-3.1-1.6-3.6-3.3-4.3c-3.8-1.6-7.7-3.1-11.6-4.5
c-7.4-2.8-10.7-2-16.9,4.5C139.4,360.2,144.8,361,149.3,363.5z M212.9,388.7c3.2,1.4,6.1,1.4,9.2-0.9c1.9-1.7,5.4-1.9,4.9-4.8
c-0.4-2.3-3.3-2.4-5.3-3.2c-3.7-1.5-7.4-3-11.1-4.4c-6.2-2.3-10.6-0.8-14.9,5.1c0.7,1.6,2.3,2.2,3.8,2.8
C204,385.1,208.5,386.8,212.9,388.7z M108.5,335.7c-6-3.7-12.7-5.4-19.4-8.3c-6.3-2.2-9.3,2.2-13.7,5.2c5.4,4.7,11.6,5.7,17.1,8.2
C99,343.6,103.8,342.1,108.5,335.7z"/>
<path class="st2" d="M279.5,216.6c0.2-1.7,0.4-3.4,0.1-5.1c-0.1-20.1-9.2-42-27.5-52.2c-1.3-0.7-2.6-1.4-3.4-2.9
c0.4-2,2-2.4,3.4-2.8h0c3.1-0.9,6.1-1.9,9.2-2.9c1.6-0.5,3.3-1.1,4.9-1.6l0,0c6.2-1.9,12.3-3.9,18.4-5.9c2.8-0.9,5-0.6,7.5,1
c12.9,8.3,19.2,20.7,22.1,35.3c1.8,9.2,1.4,18.5,0,27.8c-0.6,4-2.1,6.4-6.4,7.7c-6.6,1.9-13.2,4.1-19.9,6.4l0,0l0,0l0,0l0,0h0l0,0
c-2.7,0.9-5.5,1.9-8.3,2.8C278.9,221.5,279.2,219,279.5,216.6z M296.5,203.2c2.4,2.8,5.3,2.2,8.4,0.2c4-4.8,4.2-8.7,0.9-11
c-3-2.1-7.4-1.1-9.5,2.4C294.7,197.6,294.4,200.6,296.5,203.2z"/>
</svg>


File diff suppressed because one or more lines are too long


Binary file not shown.


File diff suppressed because one or more lines are too long


File diff suppressed because one or more lines are too long


File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 26 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 192 KiB

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 128 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 272 KiB

View File

@@ -0,0 +1,16 @@
# Agent Computer Interface (ACI)
SWE-agent is built on the idea of an **Agent-Computer Interface** (ACI).
An ACI is essentially a set of tools and an interaction format that allows an agent to interact with a computer-based environment to perform tasks such as software engineering.
The SWE-agent repository is built to make it easy to invent new ACIs for agents to solve various tasks.
Just as typical language models require good prompt engineering, **good ACI design leads to much better results when using agents**. As we show in the SWE-agent [paper](https://arxiv.org/abs/2405.15793), a baseline agent without a well-tuned ACI does much worse than SWE-agent.
SWE-agent contains features that we discovered to be immensely helpful during the agent-computer interface design process:
1. We add a **linter** that runs when an edit command is issued, and do not let the edit command go through if the code isn't syntactically correct.
2. We supply the agent with a **special-built file viewer**, instead of having it just `cat` files. We found that this file viewer works best when displaying just 100 lines in each turn. The **file editor** that we built has commands for scrolling up and down and for performing a search within the file.
3. We supply the agent with a special-built full-directory string **searching command**. We found that it was important for this tool to succinctly list the matches: we simply list each file that had at least one match. Showing the model more context about each match proved to be too confusing for the model.
4. When commands have an empty output we return a message saying "Your command ran successfully and did not produce any output."
Read our paper for more details [here](https://arxiv.org/abs/2405.15793).
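Two of the features above (rejecting syntactically invalid edits and replacing empty output with an explicit message) can be sketched in a few lines of Python. This is only an illustration: `apply_edit` and `format_observation` are hypothetical names, and `ast.parse` stands in for whatever linter the real edit command runs.

```python
import ast

EMPTY_OUTPUT_MESSAGE = "Your command ran successfully and did not produce any output."


def apply_edit(old_source: str, new_source: str) -> tuple[str, str]:
    """Return (source_to_keep, feedback); hypothetical helper, not the SWE-agent API.

    The edit is only accepted if the new source parses as Python.
    """
    try:
        ast.parse(new_source)
    except SyntaxError as exc:
        # Keep the old file contents and tell the agent why the edit was refused.
        return old_source, f"Edit rejected: {exc.msg} (line {exc.lineno})"
    return new_source, "Edit applied."


def format_observation(command_output: str) -> str:
    """Replace empty command output with an explicit success message."""
    return command_output if command_output.strip() else EMPTY_OUTPUT_MESSAGE
```

The point of both helpers is the same: the agent always receives unambiguous feedback, either a concrete reason why nothing changed or a confirmation that the command ran.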

View File

@@ -0,0 +1,17 @@
# Architecture
This page walks you through the general architecture of the SWE-agent package. Want to just run it? Jump ahead to the [installation](../installation/index.md) or [usage](../usage/index.md) sections.
![architecture](../assets/architecture.png)
The central entry point to SWE-agent is the `sweagent` command line executable (1). It initializes an instance of the [`SWEEnv`](../reference/env.md) class (2) that manages the environment.
With SWE-agent 1.0, this class is now merely a thin wrapper around our [SWE-ReX](https://swe-rex.com) package.
Upon initialization, `SWEEnv` initializes the SWE-ReX _Deployment_. The Deployment either starts a local Docker container (4), or it starts the container on a remote system like Modal or AWS (3).
Within the container, SWE-ReX starts a shell session (5) that will be used to execute the commands.
SWE-ReX also installs the [ACI](aci.md) elements as [custom tools](../config/tools.md) (9) that are available to the shell session.
The second class that is initialized is the [`Agent`](../reference/agent.md) (6). It can be configured with a yaml file (see [config](../config/config.md)). Its most important method is `forward()`, which prompts the model and executes its action.
To prompt the model, the history (all prompts to the model together with actions and outputs) needs to be sent to the LM. In order to make the best use of the context window of the model, the history gets compressed by a `HistoryProcessor` (7). The model output (8) is then interpreted by the `Agent` class (in particular, we use a [parser](../reference/parsers.md) to extract the action) and executed in the Shell session via `SWEEnv` (10).
For this, `SWEEnv` holds the SWE-ReX deployment class that talks to a server running within the docker container (11).
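The control flow described above (compress the history, query the model, parse the action, execute it, record the observation) can be sketched as follows. This is a toy model of one step, not the actual `Agent.forward()` implementation: `keep_last_n_observations`, `parse_action`, and the message format are assumptions made for illustration.

```python
from typing import Callable

History = list[dict[str, str]]


def keep_last_n_observations(history: History, n: int = 2) -> History:
    """Toy history processor: elide all but the last n observations (user messages)."""
    obs_idx = [i for i, msg in enumerate(history) if msg["role"] == "user"]
    keep = set(obs_idx[-n:]) if n > 0 else set()
    return [
        msg if msg["role"] != "user" or i in keep else {**msg, "content": "(output omitted)"}
        for i, msg in enumerate(history)
    ]


def parse_action(model_output: str) -> str:
    """Toy parser: treat the last non-empty line of the response as the action."""
    lines = [line for line in model_output.splitlines() if line.strip()]
    return lines[-1].strip() if lines else ""


def forward(
    history: History,
    query_model: Callable[[History], str],
    execute: Callable[[str], str],
) -> History:
    """One agent step: prompt the model, extract the action, run it, extend the history."""
    output = query_model(keep_last_n_observations(history))
    action = parse_action(output)
    observation = execute(action)  # in SWE-agent this would go through the environment
    return history + [
        {"role": "assistant", "content": output},
        {"role": "user", "content": observation},
    ]
```

Passing `query_model` and `execute` as callables keeps the sketch independent of any particular LM client or container backend.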

View File

@@ -0,0 +1,69 @@
# Project overview
This section of the documentation talks about the architecture and research goals of [SWE-agent](#swe-agent).
## SWE-agent <a name="swe-agent"></a>
SWE-agent turns LMs (e.g. GPT-4) into software engineering agents that can fix issues in GitHub repositories.
On [SWE-bench](https://github.com/SWE-bench/SWE-bench), SWE-agent resolves **12.29%** of issues, achieving the state-of-the-art performance on the full test set.
We accomplish our results by designing simple LM-centric commands and feedback formats to make it easier for the LM to browse the repository, view, edit and execute code files. We call this an 🤖 **Agent-Computer Interface (ACI)**.
Read more about the ACI [here](aci.md).
SWE-agent is built and maintained by researchers from Princeton University.
For a quick introduction, watch the following video:
<iframe width="560" height="315" src="https://www.youtube.com/embed/CeMtJ4XObAM?si=W2tyY9EpEe-v12EU" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
A longer lecture touching on the project's motivation, research findings, as well as providing a hands-on tutorial on how to install, use, and configure SWE-agent is provided here:
<iframe width="560" height="315" src="https://www.youtube.com/embed/d9gcXpiiDao" title="NeurIPS Hacker Cup AI: SWEAgent" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
For in-depth information, read our [paper](https://arxiv.org/abs/2405.15793). If you found this work helpful, please consider using the following citation:
```bibtex
@inproceedings{yang2024sweagent,
title={{SWE}-agent: Agent-Computer Interfaces Enable Automated Software Engineering},
author={John Yang and Carlos E Jimenez and Alexander Wettig and Kilian Lieret and Shunyu Yao and Karthik R Narasimhan and Ofir Press},
booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
year={2024},
url={https://arxiv.org/abs/2405.15793}
}
```
## SWE-agent <span class="enigma">EnIGMA</span> <a name="enigma"></a>
!!! warning "Compatibility"
    SWE-agent <span class="enigma">EnIGMA</span> is currently only available for SWE-agent v0.7.0.
SWE-agent <span class="enigma">EnIGMA</span> adds advanced **offensive cybersecurity capabilities**.
On the [NYU CTF benchmark](https://github.com/NYU-LLM-CTF/LLM_CTF_Database), EnIGMA solves **13.5%** of the capture the flag (CTF) challenges, achieving the state-of-the-art performance on the full test set of 200 challenges, **surpassing previous agents by more than 3x** ([leaderboard](https://enigma-agent.com#results)).
We accomplish our results by extending the [🤖 ACIs](../background/aci.md) concept first introduced in SWE-agent, to the cybersecurity domain. We establish the novel [**:gear: Interactive Agent Tools** (IATs)](https://swe-agent.com/0.7/background/iat/) concept, which enables our agent to use interactive tools such as a debugger, in a multitasking way such that the agent still has access to the main shell while using the debugger.
We also use a new **Summarizer** concept integrated into the agent to deal with long context. Read more about our different summarizers [here](https://swe-agent.com/0.7/config/summarizers/).
Specific [demonstrations](../config/demonstrations.md) were built for each CTF category (cryptography, reverse-engineering, forensics, ...) to enhance the model's ability to solve new tasks from the same category.
EnIGMA is built and maintained by researchers from Tel-Aviv University, New York University and Princeton University.
For a quick introduction, watch the following video:
<iframe width="560" height="315" src="https://www.youtube.com/embed/IJxqOsNFiCc?si=xtIxyCcriM9FJexK" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
For all the details, read our [paper](https://arxiv.org/abs/2409.16165). If you found this work helpful, please consider using the following citation:
```bibtex
@misc{abramovich2024enigmaenhancedinteractivegenerative,
title={EnIGMA: Enhanced Interactive Generative Model Agent for CTF Challenges},
author={Talor Abramovich and Meet Udeshi and Minghao Shao and Kilian Lieret and Haoran Xi and Kimberly Milner and Sofija Jancheska and John Yang and Carlos E. Jimenez and Farshad Khorrami and Prashanth Krishnamurthy and Brendan Dolan-Gavitt and Muhammad Shafique and Karthik Narasimhan and Ramesh Karri and Ofir Press},
year={2024},
eprint={2409.16165},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2409.16165},
}
```

View File

@@ -0,0 +1,72 @@
# Configuration
This page contains details describing how to write your own configurations to control how agents can interact with the `SWEEnv` environment.
A configuration is represented in one or more `.yaml` files, specified by the `--config` flag in the [command line interface](../usage/cl_tutorial.md), allowing you to...
* Define the [**tools**](tools.md) that agents may use to traverse + modify a codebase.
* Write [**prompts**](templates.md) that are deterministically/conditionally shown to the agent over the course of a single trajectory.
* Use [**demonstrations**](demonstrations.md) to guide the agent's behavior.
* Change the [**model behavior**](models.md) of the agent.
* Control the **input/output interface** that sits between the agent and the environment
!!! tip "Default config files"
    Our default config files are in the [`config/`](https://github.com/SWE-agent/SWE-agent/tree/main/config) directory.
    For multimodal support, use `config/default_mm_with_images.yaml` which includes image processing capabilities.
To use a config file, you can use the `--config` flag in the command line interface.
```bash
sweagent run --config config/your_config.yaml
sweagent run-batch --config config/your_config.yaml
```
You can also use more than one config file, e.g., `--config config/default.yaml --config my_config.yaml`
(note that you need to repeat `--config`).
Config options are merged in a nested way.
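To make "merged in a nested way" concrete, here is a minimal sketch of a recursive dictionary merge. It assumes that later config files override earlier ones key by key and that non-dictionary values (including lists) are replaced wholesale; `merge_nested` is a hypothetical helper, not the function SWE-agent uses.

```python
def merge_nested(base: dict, override: dict) -> dict:
    """Recursively merge `override` into a copy of `base` (illustrative only)."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: descend instead of replacing the whole subtree.
            merged[key] = merge_nested(merged[key], value)
        else:
            merged[key] = value
    return merged


default_cfg = {"agent": {"model": {"name": "gpt-4o", "temperature": 0.0}, "tools": {"bundles": ["a"]}}}
my_cfg = {"agent": {"model": {"temperature": 1.0}}}
combined = merge_nested(default_cfg, my_cfg)
```

Under these assumptions, `combined` keeps the model name and the tools from the first config and only changes the temperature.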
This is the current default configuration file which is loaded when no `--config` flag is provided:
<details>
<summary><code>default.yaml</code></summary>
```yaml title="config/default.yaml"
--8<-- "config/default.yaml"
```
</details>
!!! hint "Relative paths"
    Relative paths in config files are resolved to the `SWE_AGENT_CONFIG_ROOT` environment variable (if set)
    or the SWE-agent repository root.
## Multimodal Configuration
For working with images and vision-capable models, SWE-agent provides specialized multimodal configuration options.
These options are best demonstrated in `default_mm_with_images.yaml`.
This configuration enables full image processing capabilities:
- **SWE-bench Multimodal Image processing**: Downloads and converts GitHub issue images to base64 format for SWE-bench Multimodal instances.
- **Extended observation length**: Increases observation token limits to accommodate images
- **Image tools**: Includes `image_tools` bundle for viewing images
- **Web browsing tools**: Includes `web_browser` bundle for using web browsers
- **History processing**: Enables the `image_parsing` history processor for parsing the output of the image tools
### Key Multimodal Settings
```yaml
agent:
  templates:
    disable_image_processing: false  # enable/disable image processing
    max_observation_length: 10_000_000  # increased for images
  tools:
    bundles:
      - path: tools/image_tools  # image viewing capabilities
      - path: tools/web_browser  # browser automation tools
  history_processors:
    - type: image_parsing  # process image tools outputs (required for tools to work)
```
See the [multimodal guide](../usage/multimodal.md) for detailed configuration options.

View File

@@ -0,0 +1,41 @@
# Changing the demonstrations
!!! warning "Under construction"
    We're in the process of updating this page to reflect SWE-agent 1.0.0.
An important way to show LMs how to use commands and interact with the environment is through providing a demonstration - which is basically a completed [trajectory](../usage/trajectories.md) that the LM can learn from.
For simplicity, we only ingest demonstrations in the form of a trajectory file. However, since trajectory files are usually JSON, you can convert them to yaml using the `sweagent traj-to-demo` command to make them more human-readable and easier to edit.
Demo (yaml) files are stored in the `demos/` directory by default and consist primarily of the sequence of actions that an LM would need to take to complete a task. It's important that your demo have the proper format to be parsed by SWE-agent and your config.
## Manually creating a custom trajectory <a name="manual"></a>
You can manually generate a trajectory by running the agent with `--agent.model.name=human_thought`.
This lets you input, at each turn, the thought (ending with END_THOUGHT) and then the action (a single command).
You should then convert that trajectory into a demonstration as shown below.
To edit text in `human_thought` mode with the traditional line-based editor:
1. Run the command `edit edit_start_line:edit_end_line`
2. Write the text you want to insert. Feel free to write the text across multiple lines.
3. Press `return` then write `end_of_edit` and then press `return` again to submit the edit.
If you would like to run `human_thought` mode without having to type in a thought at each turn (for debugging for example), use `--agent.model.name=human`.
## Converting an existing trajectory into a demonstration
Here's how you can make a demo from an existing trajectory file (like the one created from the previous step):
1. Find a basic trajectory that you already like and want to use as the basis for your demo.
For instance, consider the `.traj` files in the [`trajectories/demonstrations/` folder](https://github.com/SWE-agent/SWE-agent/tree/main/trajectories/demonstrations)
or find the trajectory from the previous step (the path will be printed at the bottom).
2. Run `sweagent traj-to-demo --traj_path <path to trajectory file.traj>` to convert the trajectory to a demo.
This demo will be saved as a readable yaml file in the `demos/` directory.
3. Edit the demo by hand to make it work for your particular use case and configuration.
4. (Optional) Run `sweagent run-replay --traj_path <path to demo>` to execute the actions of the demo, have the system generate the execution output, and ensure that it works as expected.
5. Inspect the resulting trajectory to ensure it was executed correctly.
6. Specify the path to your demonstration in your [config file](config.md)
{% include-markdown "../_footer.md" %}

View File

@@ -0,0 +1,24 @@
# Environment variables
This page details all environment variables that are currently in use by SWE-agent.
* All API keys (for LMs and GitHub) can be set as an environment variable. See [here](../installation/keys.md) for more information.
* `SWE_AGENT_CONFIG_ROOT`: Used to resolve relative paths in the [config](config.md). E.g., if `SWE_AGENT_CONFIG_ROOT=/a/b/c` and you
  add a tool bundle as `tools/my_bundle`, it will be resolved to `/a/b/c/tools/my_bundle`. The default of `SWE_AGENT_CONFIG_ROOT` is
  the `swe-agent` package directory.
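The path resolution rule for `SWE_AGENT_CONFIG_ROOT` can be sketched like this. The function name `resolve_config_path` and the `default_root` parameter are hypothetical; the sketch only assumes that absolute paths are left untouched and that relative paths are joined onto the root.

```python
import os
from pathlib import Path


def resolve_config_path(path: str, default_root: str) -> Path:
    """Resolve a config path against SWE_AGENT_CONFIG_ROOT (illustrative helper)."""
    candidate = Path(path)
    if candidate.is_absolute():
        return candidate
    # Fall back to the given default (e.g. the package directory) if the variable is unset.
    root = os.environ.get("SWE_AGENT_CONFIG_ROOT", default_root)
    return Path(root) / candidate
```

With `SWE_AGENT_CONFIG_ROOT=/a/b/c`, calling `resolve_config_path("tools/my_bundle", "/pkg")` yields `/a/b/c/tools/my_bundle`, matching the example in the text.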
The following variables can only be set as environment variables, not in the config file.
If you install `swe-agent` without the `--editable` option, please make sure to set
* `SWE_AGENT_CONFIG_DIR` (default `<PACKAGE>/config`)
* `SWE_AGENT_TOOLS_DIR` (default `<PACKAGE>/tools`)
* `SWE_AGENT_TRAJECTORY_DIR` (default `<PACKAGE>/trajectories`)
In addition, the following environment variables allow you to configure logging.
* `SWE_AGENT_LOG_TIME`: Add timestamps to log
* `SWE_AGENT_LOG_STREAM_LEVEL`: Level of logging that is shown on the command line interface (`TRACE` being a custom level below `DEBUG`). Will have no effect for `run-batch`.
!!! hint "Persisting environment variables"
    Most environment variables can also be added to `.env` instead.

View File

@@ -0,0 +1,33 @@
# Environments
SWE-agent runs on docker images (`python:3.11` by default).
If you are running on SWE-bench, every instance has a docker image that we pull from Docker Hub.
Here's an example of a simple custom docker environment:
```dockerfile title="tiny.Dockerfile"
FROM python:3.11.10-bullseye # (1)!
ARG DEBIAN_FRONTEND=noninteractive # (2)!
ENV TZ=Etc/UTC
WORKDIR /
# Install swe-rex for faster startup
RUN pip install pipx
RUN pipx install swe-rex
RUN pipx ensurepath
ENV PATH="$PATH:/root/.local/bin/"
# Install any extra dependencies
RUN pip install flake8
SHELL ["/bin/bash", "-c"]
```
1. This is the base image that we're starting from
2. Important to disable any interactive prompts when installing things
Build it with `docker build -f tiny.Dockerfile -t swe-agent-tiny .`.
Now you can run it in the agent with `sweagent run --env.deployment.image swe-agent-tiny ...`

View File

@@ -0,0 +1 @@
config.md

View File

@@ -0,0 +1,144 @@
# Models
!!! tip "Required reading"
    See [the model section](../installation/keys.md) in the installation guide for a primer before reading the rest of this page.
!!! tip "Related pages"
    * See [the model config reference](../reference/model_config.md) for the full list of model options
    * To control how the agent extracts the actions from the model response, see the [action parsers](../reference/parsers.md) reference
## Notes for specific models
### Local models
See [the model section](../installation/keys.md) in the installation guide.
Remember to unset spending limits and configure the action parser if your model does not support function calling.
For "cost" tracking with local models, you can optionally provide a custom `litellm_model_registry` file in your configuration.
This allows you to define custom pricing information for your local models instead of disabling cost limits entirely.
See the [local models section](../installation/keys.md#custom-model-registry-for-cost-tracking) for detailed instructions.
### Anthropic Claude
Prompt caching makes SWE-agent several times more affordable. While this is done automatically for models like `gpt-4o`,
care has to be taken for Anthropic Claude, as you need to manually set the cache break points.
For this, include the following history processor:
```yaml
agent:
  history_processors:
    - type: cache_control
      last_n_messages: 2
```
!!! warning "Other history processors"
    Other history processors might interfere with the prompt caching
    if you are not careful.
    However, if your history processor is only modifying the last observation,
    you can combine as done [here](https://github.com/SWE-agent/SWE-agent/blob/main/config/sweagent_heavy.yaml).
Anthropic Claude gives you 4 cache break points per key.
You need two of them for a single agent run (because the break points are both used to retrieve and set the cache).
Therefore, you can only run two parallel instances of SWE-agent with [`run-batch`](../usage/batch_mode.md) per key.
To support more parallel running instances, supply multiple keys as described below.
We recommend that you check how often you hit the cache. A very simple way is to go to your trajectory directory and grep like so:
```bash
grep -o "cached_tokens=[0-9]*" django__django-11299.debug.log
```
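If you prefer a summary over raw `grep` output, the same check can be done in Python. This is a rough sketch that only assumes the log contains `cached_tokens=<number>` fragments, as in the `grep` pattern above; the helper names are made up for illustration.

```python
import re

CACHED_TOKENS = re.compile(r"cached_tokens=([0-9]+)")


def cached_token_counts(log_text: str) -> list[int]:
    """Extract every cached_tokens=<n> value from a debug log."""
    return [int(value) for value in CACHED_TOKENS.findall(log_text)]


def cache_hit_rate(log_text: str) -> float:
    """Fraction of logged model calls that reported at least one cached token."""
    counts = cached_token_counts(log_text)
    if not counts:
        return 0.0
    return sum(1 for count in counts if count > 0) / len(counts)
```

To use it, read the log with something like `Path("django__django-11299.debug.log").read_text()` and pass the text to `cache_hit_rate`. A rate close to zero after the first few steps suggests that the cache break points are not being set as intended.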
Note that the maximum number of output tokens of Claude 3.7/4 can be extended with extra headers.
See [this issue in litellm](https://github.com/BerriAI/litellm/issues/8984) and [swe-agent PR #1035](https://github.com/SWE-agent/SWE-agent/issues/1035)
for more information.
Since [#1036](https://github.com/SWE-agent/SWE-agent/pull/1036) you can also manually set the maximum output tokens and override the information
from `litellm`.
To use extended thinking, you can set the following in your config:
```yaml
agent:
  model:
    name: 'claude-sonnet-4-20250514'
    temperature: 1.
    completion_kwargs:
      reasoning_effort: 'high'
```
### o1
Make sure to set
```yaml
agent:
  model:
    top_p: null
    temperature: 1.
```
as other values aren't supported by `o1`.
## Using multiple keys
We support rotating through multiple keys for [`run-batch`](../usage/batch_mode.md). For this, concatenate all keys with `:::` and set them via the `--agent.model.api_key` flag.
Every thread (i.e., every parallel running agent that is working on one task instance) will stick to one key during the entire run, i.e., this does not break prompt caching.
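The key handling can be pictured with a small sketch. Splitting on `:::` follows the description above; the round-robin assignment by thread index and the helper names are assumptions for illustration, not a description of the actual implementation. `max_parallel_runs` restates the cache break point arithmetic from the Anthropic Claude section.

```python
def split_keys(api_key: str) -> list[str]:
    """Split a ':::'-concatenated key string into individual keys."""
    return [key for key in api_key.split(":::") if key]


def key_for_thread(api_key: str, thread_index: int) -> str:
    """Pick one key per thread and keep it fixed (assumed round-robin assignment)."""
    keys = split_keys(api_key)
    if not keys:
        raise ValueError("No API key provided")
    return keys[thread_index % len(keys)]


def max_parallel_runs(n_keys: int, break_points_per_key: int = 4, break_points_per_run: int = 2) -> int:
    """With 4 cache break points per key and 2 per run, each key supports 2 parallel runs."""
    return n_keys * (break_points_per_key // break_points_per_run)
```

Because each thread always maps to the same key, the cached prefix for a given task instance stays with one key for the whole run.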
### Custom cost tracking
If you want to track costs for models not in the default litellm registry, you can provide a custom model registry file. This is particularly useful for:
- New models not yet supported by litellm's default registry
- Overriding default / old cost values in litellm
- Local models that you want to track "costs" for, to compare to other results
This file will override entries in the [litellm community model cost file](https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json).
Create a JSON file with your model's cost information following the litellm model registry format:
```json title="my_model_registry.json"
{
  "ollama/llama2": {
    "max_tokens": 8192,
    "input_cost_per_token": 0.00002,
    "output_cost_per_token": 0.00006,
    "litellm_provider": "ollama",
    "mode": "chat"
  },
  "my-custom-provider/my-new-model": {
    "max_tokens": 8192,
    "max_input_tokens": 8192,
    "max_output_tokens": 8192,
    "input_cost_per_token": 0.000001,
    "output_cost_per_token": 0.000002,
    "litellm_provider": "openai",
    "mode": "chat"
  }
}
```
Then specify this registry in your config:
```yaml title="config/your_config.yaml"
agent:
  model:
    litellm_model_registry: "my_model_registry.json"  # Path to your custom registry
    ...
```
If you need to modify the tokenizer that is used when calculating costs, you can set the `custom_tokenizer` setting in the [model config](../reference/model_config.md).
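To see what the per-token prices in a registry entry amount to, the following sketch computes the cost of a single call from the `input_cost_per_token` and `output_cost_per_token` fields shown above. `estimate_cost` is a hypothetical helper; the actual bookkeeping is done by litellm and may account for additional factors such as cached tokens.

```python
import json

REGISTRY_JSON = """
{
  "ollama/llama2": {
    "input_cost_per_token": 0.00002,
    "output_cost_per_token": 0.00006
  }
}
"""


def estimate_cost(registry: dict, model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one call = input tokens * input price + output tokens * output price."""
    entry = registry[model]
    return input_tokens * entry["input_cost_per_token"] + output_tokens * entry["output_cost_per_token"]


registry = json.loads(REGISTRY_JSON)
cost = estimate_cost(registry, "ollama/llama2", input_tokens=1000, output_tokens=500)
```

For this entry, 1000 input tokens and 500 output tokens come to roughly 0.02 + 0.03 = 0.05 in whatever currency unit the registry assumes.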
## Models for testing
We also provide models for testing SWE-agent without spending any credits
* `HumanModel` and `HumanThoughtModel` will prompt for input from the user that stands in for the output of the LM. This can be used to create new [demonstrations](../config/demonstrations.md#manual).
* `ReplayModel` takes a trajectory as input and "replays it"
* `InstantEmptySubmitTestModel` will create an empty `reproduce.py` and then submit
{% include-markdown "../_footer.md" %}

View File

@@ -0,0 +1,20 @@
## Configuring templates
The following diagram illustrates where each template is shown within a single episode of solving one task instance.
![template workflow](../assets/template_workflow.png)
One of three templates can be shown per turn:
* "Next Step" (`next_step_template`): Displayed if the model's action successfully runs. The output and a prompt for the next action are shown.
* "Next Step (No Output)" (`next_step_no_output_template`): Displayed if the model's action successfully runs, but does not produce any standard output (e.g. `rm`, `cd`)
* "Format Error" (`format_error_template`): Displayed if the model's response is malformed. Over the next two turns...
    * If one of the model's next two responses is correct, the message history is updated such that the "Format Error" turn is not kept. The episode continues.
    * If the model's next two responses are both malformed, the episode terminates.
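The selection rules above can be summarized in a short sketch. The template names come from the list; `choose_template`, `should_terminate`, and the `retries_allowed` parameter are hypothetical and only mirror the logic as described here.

```python
def choose_template(is_malformed: bool, output: str) -> str:
    """Return the name of the template shown after the model's response."""
    if is_malformed:
        return "format_error_template"
    if not output.strip():
        # The action ran but printed nothing (e.g. `rm`, `cd`).
        return "next_step_no_output_template"
    return "next_step_template"


def should_terminate(consecutive_malformed: int, retries_allowed: int = 2) -> bool:
    """The episode ends only if the retries after a malformed response are malformed too."""
    return consecutive_malformed > retries_allowed
```

With the default of two retries, a first malformed response plus two malformed retries (three in a row) ends the episode, while any correct response in between resets the count.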
!!! tip "All options"
    See the [template reference](../reference/template_config.md) for all options.
{% include-markdown "../_footer.md" %}

View File

@@ -0,0 +1,80 @@
# Configuring tools
!!! seealso "Tutorials"
    See the [tutorial on adding a new tool](../usage/adding_custom_tools.md)!
Tools are one of the ways to configure and extend the agent.
Typically, there is
* The `bash` tool, allowing the agent to run shell commands (including invoking python scripts)
* Specific tools for the agent to inspect the code (file viewer, etc)
* Code editors (for example with search and replace or line range based methods)
With SWE-agent, these tools are organized in _tool bundles_.
Each tool bundle is a folder with the following structure:
```
bundle/
├── bin/
│   ├── <tool executable>
│   └── <state executable>
├── config.yaml
├── install.sh
├── README.md
└── pyproject.toml
```
The `bin/` folder contains the actual tool implementation as executables.
Here's an example of a tool bundle config:
```yaml
tools:
  filemap:
    signature: "filemap <file_path>"
    docstring: "Print the contents of a Python file, skipping lengthy function and method definitions."
    arguments:
      - name: file_path
        type: string
        description: The path to the file to be read
        required: true
```
Another important key is the `state` field.
The `state` command is a special command that is executed after every action and returns a json string that we parse.
The resulting dictionary can be used to format prompt templates.
For example, for the classical SWE-agent tools, we extract the working directory and the currently open file like so:
```python title="tools/windowed/bin/_state"
#!/usr/bin/env python3

import json
import os
from pathlib import Path

from registry import registry  # type: ignore


def main():
    current_file = registry.get("CURRENT_FILE")
    open_file = "n/a" if not current_file else str(Path(current_file).resolve())
    state = {"open_file": open_file, "working_dir": os.getcwd()}
    print(json.dumps(state))


if __name__ == "__main__":
    main()
```
To use it, we set the following config key:
```yaml
tools:
...
state_command: "_state"
```
To see the full specification of the state command, see the [tool config documentation](../reference/bundle_config.md).
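As a rough picture of how the state dictionary ends up in a prompt, the sketch below parses the JSON printed by a state command and fills a template with it. The use of `str.format`, the template text, and the `render_prompt` helper are assumptions for illustration; the real template syntax may differ.

```python
import json


def render_prompt(template: str, state_output: str, **extra: str) -> str:
    """Parse the state command's JSON output and use it to fill a prompt template."""
    state = json.loads(state_output)
    return template.format(**state, **extra)


template = "{observation}\n(Open file: {open_file})\n(Current directory: {working_dir})"
state_output = '{"open_file": "n/a", "working_dir": "/repo"}'
prompt = render_prompt(template, state_output, observation="Done.")
```

Here `open_file` and `working_dir` come from the state command, while `observation` is supplied separately by the caller.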

View File

@@ -0,0 +1,147 @@
# Contribute to SWE-agent
!!! tip "Formatting change"
    We've recently added automated formatting to our code base.
    If you are dealing with merge-conflicts when opening a PR or updating your fork,
    please first install `pre-commit` and run `pre-commit run --all-files` and try again.
{%
include-markdown "../../CONTRIBUTING.md"
start="<!-- INCLUSION START -->"
end="<!-- INCLUSION END -->"
%}
Wanna do more and actually contribute code? Great! Please see the following sections for tips and guidelines!
## Development repository set-up
Please install the repository from source, following our [usual instructions](../installation/source.md) but add the `[dev]` option to the `pip` command (you can just run the command again):
```bash
pip install -e '.[dev]'
```
Then, make sure to set up [`pre-commit`](https://pre-commit.com):
```bash
# cd to our repo root
pre-commit install
```
`pre-commit` will check for formatting and basic syntax errors before your commits.
!!! tip "Autofixes"
    Most problems (including formatting) will be automatically fixed.
    Therefore, if `pre-commit`/`git commit` fails on its first run, simply try running it a second time.
    Some more autofixes can be enabled with the `--unsafe-fixes` option from [`ruff`](https://github.com/astral-sh/ruff):
    ```bash
    pipx run ruff check --fix --unsafe-fixes
    ```
## Running tests
We provide a lot of tests that can be very helpful for rapid development.
Run them with
```bash
pytest
```
Some of the tests might be slower than others. You can exclude them with
```bash
pytest -m "not slow"
```
You can run all tests in parallel with `pytest-xdist`:
```bash
pytest -n auto
```
If you are using VSCode, you might want to add the following two files:
<details>
<summary><code>.vscode/launch.json</code></summary>
```json
--8<-- "docs/dev/vscode_launch.json"
```
</details>
<details>
<summary><code>.vscode/settings.json</code></summary>
```json
--8<-- "docs/dev/vscode_settings.json"
```
</details>
## Debugging
We recommend installing `pdbpp` for some improved debugger features:
```bash
pip install pdbpp
```
Set breakpoints with `breakpoint()` and then run `sweagent` with `pdb`:
```bash
pdb -m sweagent <command> -- <more command line arguments> # (1)!
```
1. Note the `--` before the options passed to sweagent. This is to separate
options passed to `pdb` from those that are passed to `sweagent`.
When working on a test that fails, you can use
```
pytest -k name_of_test -s --capture=no --log-cli-level=DEBUG
```
to see all debug output from the agent.
## Tips for pull requests
* If you see a lot of formatting-related merge conflicts, please see [here](formatting_conflicts.md).
* Please open separate PRs for separate issues. This makes it easier to incorporate part of your changes.
* It might be good to open an issue and discuss first before investing time on an experimental feature.
* Don't know where to get started? Look for issues marked [👋 good first issue][gfi] or [🙏 help wanted][help_wanted]
* When changing the behavior of the agent, we need to have some indication that it actually improves the success rate of SWE-agent.
However, if you make the behavior optional without complicating SWE-agent (for example by providing new [commands](../config/tools.md)),
we might be less strict.
* Please add simple unit tests or integration tests wherever possible. Take a look in the [tests directory](https://github.com/SWE-agent/SWE-agent/tree/main/tests)
for inspiration. We emphasize simple easy-to-write tests that get a lot of coverage.
[gfi]: https://github.com/SWE-agent/SWE-agent/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc+label%3A%22%F0%9F%91%8B+good+first+issue%22+
[help_wanted]: https://github.com/SWE-agent/SWE-agent/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc+label%3A%22%F0%9F%99%8F+help+wanted%22
## Building the documentation <a name="mkdocs"></a>
Simply run
```bash
# cd repo root
mkdocs serve
```
and point your browser to port 8000 or click one of the links in the output.
## Diving into the code
<div class="grid cards">
<a href="../reference/" class="nav-card-link">
<div class="nav-card">
<div class="nav-card-header">
<span class="material-icons nav-card-icon">settings</span>
<span class="nav-card-title">Code structure and reference</span>
</div>
<p class="nav-card-description">Read the reference for more information on our code.</p>
</div>
</a>
</div>
{% include-markdown "../_footer.md" %}

View File

@@ -0,0 +1,65 @@
On May 28th, 2024, we introduced automated formatting with `ruff-format` and `pre-commit`. This changed almost every file in the project.
If you forked or branched off before these changes and now try to synchronize your fork/branch with `SWE-agent/SWE-agent:main`, you will
see a lot of merge conflicts.
To solve this, you need to apply the same formatting to your code. Here's how you can do it.
First let's add the official remote (if it exists, you've probably already added it and you can ignore the warning).
```bash
git remote add upstream https://github.com/SWE-agent/SWE-agent.git
git fetch upstream
```
Now, you need the updated `pyproject.toml` and `.pre-commit-config.yaml` files.
We can get them from `SWE-agent/SWE-agent:main`:
```bash
git checkout upstream/main -- .pre-commit-config.yaml pyproject.toml
git commit -m "Update formatting instructions" --no-verify
```
Let's assume that your changes are on branch `FEATURE_BRANCH`, for example, if you've committed to `main`:
```bash
export FEATURE_BRANCH="main"
```
Next we create a copy of this branch (so we don't further modify it):
```bash
git branch "${FEATURE_BRANCH}_REBASED" "${FEATURE_BRANCH}"
```
And now comes the tricky bit: We rebase your changes on top of `upstream/main`, while applying
the formatting fixes at every step:
```bash
git rebase upstream/main "${FEATURE_BRANCH}_REBASED" \
-Xtheirs \
--exec 'git reset --soft HEAD^; pre-commit run; pipx run ruff check --fix --unsafe-fixes; git add -u; git commit -C HEAD@{1} --no-verify'
```
!!! note "Understanding the last command"
Here's what is happening:
* `git rebase upstream/main "${FEATURE_BRANCH}_REBASED"`
applies every commit from `"${FEATURE_BRANCH}_REBASED"` on top of `upstream/main`.
* `-Xtheirs` tells git to always take _your_ changes for merge conflicts
(rather than the format changes).
* After every commit, the command from `--exec` is being called.
* `git reset --soft HEAD^` undos the `git commit` action (while leaving the
changes staged),
* then we apply the formatting, and
* finally we commit the
formatted changes again.
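To see the `--exec` step in isolation, here is a minimal sketch that replays it in a throwaway repository. Everything in it is an assumption made for illustration: the temporary directory, the file `example.py`, the commit messages, and the `sed` call, which only stands in for the real formatters (`pre-commit run` and `ruff`).

```bash
#!/usr/bin/env bash
# Throwaway demo of the --exec step: rewrite the content of the latest
# commit while keeping its original message.
set -euo pipefail

demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet
git config user.email "demo@example.com"
git config user.name "Demo"

printf 'x=1\n' > example.py
git add example.py
git commit --quiet -m "Initial commit"

printf 'y=2\n' >> example.py
git commit --quiet -am "Add y"

# 1. Undo the last commit but keep its changes staged.
git reset --quiet --soft HEAD^
# 2. Stand-in for the formatters: put spaces around "=".
sed -i.bak 's/=/ = /' example.py && rm example.py.bak
# 3. Stage the modified tracked files and reuse the old commit's message.
git add -u
git commit --quiet -C 'HEAD@{1}' --no-verify

git log -1 --format=%s   # prints "Add y"
cat example.py           # now contains the reformatted lines
```

`HEAD@{1}` is the reflog entry for where `HEAD` pointed before the reset, so `-C` copies the message and authorship of the commit that was just undone.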
!!! tip "Still merge conflicts?"
It's possible that there are non-formatting-related merge conflicts that you are encountering.
In this case, `git rebase` will stop every time it cannot resolve the conflict.
Simply fix the merge conflicts as you would normally do (edit the file, commit once done),
and then run `git rebase --continue`.
You can now open a PR from `${FEATURE_BRANCH}_REBASED` or make it your new default branch.
{% include-markdown "../_footer.md" %}

View File

@@ -0,0 +1,17 @@
{
"version": "0.2.0",
"configurations": [
{
"name": "Python: Debug Tests",
"type": "debugpy",
"request": "launch",
"program": "${file}",
"purpose": ["debug-test"],
"console": "integratedTerminal",
"env": {
"PYTEST_ADDOPTS": "--no-cov"
},
"justMyCode": false
}
]
}

View File

@@ -0,0 +1,12 @@
{
"python.testing.pytestArgs": [
"sweagent",
"--cov=sweagent/",
"--cov-report=xml:cov.xml",
"-n", "auto",
"tests"
],
"python.testing.unittestEnabled": false,
"python.testing.pytestEnabled": true,
"git.ignoreLimitWarning": true
}

62
.agent/vendor/mini-swe/docs/faq.md vendored Normal file
View File

@@ -0,0 +1,62 @@
# Frequently Asked Questions
## Basics
> Does SWE-agent run on Windows/macOS/Linux?
Yes! Your only limitation might be the availability of the Docker containers for your environments.
But you can always execute SWE-agent in the cloud.
> I got a very long error message about various configuration options not working. What's up?
This is probably because of union types.
See [this section](usage/cl_tutorial.md#union-types) for more information, but the short version is that some options (e.g., the repository or problem statement) can be specified in multiple ways, so we try every option until we find the one that works based on your inputs.
If none of them work, we throw an error which then tells you why we cannot initialize any of the types, so this will get somewhat long and confusing.
> Why are my images not being processed?
Check that you're using a multimodal configuration (see `default_mm_with_images.yaml` as an example), have internet connectivity, and images are under 10MB. See [Multimodal usage notes](usage/multimodal.md) for more details.
## Models
> What models are supported? Do you support local models?
Probably all of them, including local models! There are even a few for testing. See [models](installation/keys.md) and [more on models](config/models.md).
> Does SWE-agent support multimodal models and images?
Yes! SWE-agent supports vision-capable models that can process images from GitHub issues. Use `--config config/default_mm_with_images.yaml` and specify a multimodal model like Claude Sonnet 4 or GPT-4o. See the [multimodal guide](usage/multimodal.md) for details.
> What can I do if my model doesn't support function calling?
You can configure how to parse the model's response by choosing your `agent.tools.parse_function`.
The default now is `function_calling`, but you can change it to `thought_action`.
More information in the [reference](reference/parsers.md).
There are also some config examples in our [config folder](https://github.com/SWE-agent/SWE-agent/tree/main/config).
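As a rough sketch, the parser can also be switched from the command line. This assumes that the option is exposed as `agent.tools.parse_function.type` and that the usual `sweagent run` overrides apply. The model name and the repository/issue URLs are placeholders, and other settings a local model may need are omitted.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Placeholder model name: substitute the model you actually use.
model_name="ollama/llama2"

sweagent_args=(
  run
  --agent.model.name="$model_name"
  # Parse "thought + action" text instead of relying on function calling.
  --agent.tools.parse_function.type=thought_action
  --env.repo.github_url=https://github.com/SWE-agent/test-repo
  --problem_statement.github_url=https://github.com/SWE-agent/test-repo/issues/1
)

# Dry run: print the command for review. Remove `echo` to execute it.
echo sweagent "${sweagent_args[@]}"
```

Keeping the arguments in an array makes it easy to review the full command before spending any model budget.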
## Configuring SWE-agent
> How can I change the demonstrations given to SWE-agent?
At the start of each run, we feed the agent a demonstration trajectory, showing it how to solve an example issue.
This substantially improves the agent's ability to solve novel issues.
If you'd like to modify or replace this demonstration to better fit your use case, see [the demonstrations documentation](config/demonstrations.md).
> Can I add custom tools?
Yes! Take a look at [this tutorial](usage/adding_custom_tools.md).
## MISC
> What's up with all the output files?
You're probably most interested in the `*.traj` files, which contain complete records of SWE-agent's thought process and actions. See [output files](usage/trajectories.md) for more information.
## Anything else?
> I have a question/bug report/feature request...
Please open a [GitHub issue](https://github.com/SWE-agent/SWE-agent/issues)!
{% include-markdown "_footer.md" %}

89
.agent/vendor/mini-swe/docs/index.md vendored Normal file
View File

@@ -0,0 +1,89 @@
---
title: "Getting Started"
---
<style>
.md-typeset h1,
.md-content__button {
display: none;
}
</style>
<div style="text-align: center;">
<img class="light-mode-only" src="assets/readme_assets/swe-agent-banner-light.svg" alt="SWE-agent banner" style="height: 10em;">
<img class="dark-mode-only" src="assets/readme_assets/swe-agent-banner-dark.svg" alt="SWE-agent banner" style="height: 10em;">
</div>
!!! warning "We recommend mini-swe-agent instead of SWE-agent"
Most of our current development effort is on [mini-swe-agent](https://github.com/SWE-agent/mini-swe-agent/),
which has superseded SWE-agent. It matches the performance of SWE-agent, while being
much simpler.
See the [FAQ](https://mini-swe-agent.com/latest/faq/) for more details about the differences.
SWE-agent enables your language model of choice (e.g. GPT-4o or Claude Sonnet 4) to autonomously use tools to
[fix issues in real GitHub repositories](https://swe-agent.com/latest/usage/hello_world),
[find cybersecurity vulnerabilities](https://enigma-agent.com/), or
[perform any custom task](https://swe-agent.com/latest/usage/coding_challenges).
* **State of the art** on SWE-bench among open-source projects
* **Free-flowing & generalizable**: Leaves maximal agency to the LM
* **Configurable & fully documented**: Governed by a single `yaml` file
* **Made for research**: Simple & hackable by design
SWE-agent is built and maintained by researchers from Princeton University and Stanford University.
<div class="grid cards">
<a href="installation/" class="nav-card-link">
<div class="nav-card">
<div class="nav-card-header">
<span class="material-icons nav-card-icon">download</span>
<span class="nav-card-title">Installation</span>
</div>
<p class="nav-card-description">Installing SWE-agent.</p>
</div>
</a>
<a href="usage/hello_world/" class="nav-card-link">
<div class="nav-card">
<div class="nav-card-header">
<span class="material-icons nav-card-icon">settings</span>
<span class="nav-card-title">Hello world</span>
</div>
<p class="nav-card-description">Solve a GitHub issue with SWE-agent.</p>
</div>
</a>
<a href="usage/" class="nav-card-link">
<div class="nav-card">
<div class="nav-card-header">
<span class="material-icons nav-card-icon">lightbulb</span>
<span class="nav-card-title">User guides</span>
</div>
<p class="nav-card-description">Dive deeper into SWE-agent's features and goals.</p>
</div>
</a>
<a href="background/" class="nav-card-link">
<div class="nav-card">
<div class="nav-card-header">
<span class="material-icons nav-card-icon">book</span>
<span class="nav-card-title">Background & goals</span>
</div>
<p class="nav-card-description">Learn more about the project goals and academic research.</p>
</div>
</a>
</div>
## 📣 News
* July 24: [Mini-SWE-Agent](https://github.com/SWE-agent/mini-SWE-agent) achieves 65% on SWE-bench verified in 100 lines of Python!
* July 9: [Multimodal support for SWE-agent](usage/multimodal.md) - Process images from GitHub issues with vision-capable AI models
* May 2: [SWE-agent-LM-32b](https://swesmith.com) achieves open-weights SOTA on SWE-bench
* Feb 28: [SWE-agent 1.0 + Claude 3.7 is SoTA on SWE-Bench full](https://x.com/KLieret/status/1895487966409298067)
* Feb 25: [SWE-agent 1.0 + Claude 3.7 is SoTA on SWE-bench verified](https://x.com/KLieret/status/1894408819670733158)
* Feb 13: [Releasing SWE-agent 1.0: SoTA on SWE-bench light & tons of new features](https://x.com/KLieret/status/1890048205448220849)
* Dec 7: [An interview with the SWE-agent & SWE-bench team](https://www.youtube.com/watch?v=fcr8WzeEXyk)
## ✍️ Doc updates
* June 26: [Adding custom tools](usage/adding_custom_tools.md)
* Apr 8: [Running SWE-agent competitively](usage/competitive_runs.md)
* Mar 7: [Updated SWE-agent architecture diagram of 1.0](background/architecture.md)

View File

@@ -0,0 +1,342 @@
# Changelog
## SWE-agent 1.1.0 (2025-05-22)
We're very excited to announce our new project [SWE-smith](https://swesmith.com/), which generates tens of thousands of training trajectories for SWE agents.
Using this training data, our LM SWE-agent-LM-32b achieves open-weights SotA on SWE-bench verified with SWE-agent!
Apart from that, v1.1.0 is mostly a fix release with minor improvements, in particular adding compatibility with SWE-bench multilingual/multimodal, and SWE-smith.
### Breaking changes
* Changes to trajectory data format. The `messages` field is replaced by `query` by [@klieret](https://github.com/klieret) in [#1107](https://github.com/princeton-nlp/SWE-agent/pull/1107)
* Renamed many tool bundles that used "windowed" file viewer (`defaults` and more) by [@klieret](https://github.com/klieret) in [#1147](https://github.com/princeton-nlp/SWE-agent/pull/1147)
* Removed `review_on_submit` tool bundle (replaced by `review_on_submit_m`) by [@klieret](https://github.com/klieret) in [#1148](https://github.com/princeton-nlp/SWE-agent/pull/1148)
* Change in `windowed` tools (formerly `default`): Don't append \n to new file by [@klieret](https://github.com/klieret) in [#1114](https://github.com/princeton-nlp/SWE-agent/pull/1114)
### Added
New dataset support:
* Feat: Support multilingual evaluation by [@kabirgh](https://github.com/kabirgh) in [#1090](https://github.com/princeton-nlp/SWE-agent/pull/1090)
* Feat: SWE-smith & multimodal base support by [@klieret](https://github.com/klieret) in [#1092](https://github.com/princeton-nlp/SWE-agent/pull/1092)
New utilities:
* Feat: Add quick-stats tool by [@klieret](https://github.com/klieret) in [#1125](https://github.com/princeton-nlp/SWE-agent/pull/1125)
### Enhanced
* Feat: Config/override max_output_tokens by [@klieret](https://github.com/klieret) in [#1036](https://github.com/princeton-nlp/SWE-agent/pull/1036)
* Enh: [#1042] fix(run_batch): handle JSON parsing errors in trajectory check by [@FRAOTIAC](https://github.com/FRAOTIAC) in [#1043](https://github.com/princeton-nlp/SWE-agent/pull/1043)
* Enh: Allow to override tools dirs etc. by [@klieret](https://github.com/klieret) in [#1046](https://github.com/princeton-nlp/SWE-agent/pull/1046)
* Enh: Allow to override path to swe-bench dataset by [@klieret](https://github.com/klieret) in [#1093](https://github.com/princeton-nlp/SWE-agent/pull/1093)
* Enh: Allow to disable python-standalone for batch by [@klieret](https://github.com/klieret) in [#1115](https://github.com/princeton-nlp/SWE-agent/pull/1115)
* Enh: More information on skipped exit status by [@klieret](https://github.com/klieret) in [#1117](https://github.com/princeton-nlp/SWE-agent/pull/1117)
### Fixed
* Fix: Setting max_input_tokens to 0 by [@klieret](https://github.com/klieret) in [#999](https://github.com/princeton-nlp/SWE-agent/pull/999)
* Fix: Explicitly set log file encoding by [@klieret](https://github.com/klieret) in [#1013](https://github.com/princeton-nlp/SWE-agent/pull/1013)
* Fix: Ensure pydantic-settings env prefix set by [@klieret](https://github.com/klieret) in [#1018](https://github.com/princeton-nlp/SWE-agent/pull/1018)
* Fix: run batch processing with modal by [@vsee](https://github.com/vsee) in [#1023](https://github.com/princeton-nlp/SWE-agent/pull/1023)
* Fix: Catch exit forfeit by [@klieret](https://github.com/klieret) in [#1024](https://github.com/princeton-nlp/SWE-agent/pull/1024)
* Fix: Use 'latest' image tag for SWE-Bench images by [@klieret](https://github.com/klieret) in [#1029](https://github.com/princeton-nlp/SWE-agent/pull/1029)
* Fix: Show tenacity retry reasons by [@klieret](https://github.com/klieret) in [#1032](https://github.com/princeton-nlp/SWE-agent/pull/1032)
* Fix: Compatibility with textual 2.0 by [@klieret](https://github.com/klieret) in [#1033](https://github.com/princeton-nlp/SWE-agent/pull/1033)
* Fix: Use default trajectories dir according to ENV by [@vsee](https://github.com/vsee) in [#1054](https://github.com/princeton-nlp/SWE-agent/pull/1054)
* Fix: fix Windows path error, replace Path with PurePosixPath or string by [@alwaysgoodtime](https://github.com/alwaysgoodtime) in [#1052](https://github.com/princeton-nlp/SWE-agent/pull/1052)
* Fix: Ensure tools PATH takes precedence by [@klieret](https://github.com/klieret) in [#1058](https://github.com/princeton-nlp/SWE-agent/pull/1058)
* Fix: Ensure state exists by [@klieret](https://github.com/klieret) in [#1065](https://github.com/princeton-nlp/SWE-agent/pull/1065)
* Fix spelling of 'agent' in hello world by [@edspencer](https://github.com/edspencer) in [#1077](https://github.com/princeton-nlp/SWE-agent/pull/1077)
* Fix: Inspector needs to handle new message format by [@klieret](https://github.com/klieret) in [#1094](https://github.com/princeton-nlp/SWE-agent/pull/1094)
* Fix: SWEBenchInstances with path and no subset initiated as other instance type by [@klieret](https://github.com/klieret) in [#1096](https://github.com/princeton-nlp/SWE-agent/pull/1096)
* Fix: Token limit exceeded for PR body issue by [@klieret](https://github.com/klieret) in [#1098](https://github.com/princeton-nlp/SWE-agent/pull/1098)
* Fix: Work around litellm claude 3.7 tokens to 128k by [@klieret](https://github.com/klieret) in [#1106](https://github.com/princeton-nlp/SWE-agent/pull/1106)
* Fix(repo): Ensure absolute path for copy repo by [@klieret](https://github.com/klieret) in [#1116](https://github.com/princeton-nlp/SWE-agent/pull/1116)
* Fix execution time timeouts by [@klieret](https://github.com/klieret) in [#1118](https://github.com/princeton-nlp/SWE-agent/pull/1118)
* Fix: Hierarchical merge of multiple configs by [@klieret](https://github.com/klieret) in [#1123](https://github.com/princeton-nlp/SWE-agent/pull/1123)
* fix message type missing by [@klieret](https://github.com/klieret) in [#1127](https://github.com/princeton-nlp/SWE-agent/pull/1127)
* Fix: Conditional for warning about empty template by [@klieret](https://github.com/klieret) in [#1137](https://github.com/princeton-nlp/SWE-agent/pull/1137)
### New Contributors
* [@vsee](https://github.com/vsee) made their first contribution in [#1023](https://github.com/princeton-nlp/SWE-agent/pull/1023)
* [@FRAOTIAC](https://github.com/FRAOTIAC) made their first contribution in [#1043](https://github.com/princeton-nlp/SWE-agent/pull/1043)
* [@jpaodev](https://github.com/jpaodev) made their first contribution in [#1050](https://github.com/princeton-nlp/SWE-agent/pull/1050)
* [@alwaysgoodtime](https://github.com/alwaysgoodtime) made their first contribution in [#1052](https://github.com/princeton-nlp/SWE-agent/pull/1052)
* [@alexgshaw](https://github.com/alexgshaw) made their first contribution in [#1056](https://github.com/princeton-nlp/SWE-agent/pull/1056)
* [@talorabr](https://github.com/talorabr) made their first contribution in [#1026](https://github.com/princeton-nlp/SWE-agent/pull/1026)
* [@katia-sentry](https://github.com/katia-sentry) made their first contribution in [#1070](https://github.com/princeton-nlp/SWE-agent/pull/1070)
* [@edspencer](https://github.com/edspencer) made their first contribution in [#1077](https://github.com/princeton-nlp/SWE-agent/pull/1077)
* [@kabirgh](https://github.com/kabirgh) made their first contribution in [#1090](https://github.com/princeton-nlp/SWE-agent/pull/1090)
**Full Changelog**: https://github.com/SWE-agent/SWE-agent/compare/v1.0.1...v1.1.0
## SWE-agent 1.0.1 (2025-02-28)
This fixup release brings fixes mostly to the compatibility with local models. We have also significantly expanded the documentation in that aspect ([models & keys documentation](https://swe-agent.com/latest/installation/keys/)).
### Changed
* Change: Make anthropic_filemap the new default config by [@klieret](https://github.com/klieret) in [#927](https://github.com/princeton-nlp/SWE-agent/pull/927)
### Added
* Enh: Set timeout for post_startup_commands by [@klieret](https://github.com/klieret) in [#973](https://github.com/princeton-nlp/SWE-agent/pull/973)
* Enh: Allow to override max_input_tokens for local models by [@klieret](https://github.com/klieret) in [#992](https://github.com/princeton-nlp/SWE-agent/pull/992)
### Fixes
* Fix: Handling local models cost lookup issues by [@klieret](https://github.com/klieret) in [#937](https://github.com/princeton-nlp/SWE-agent/pull/937)
* Fix: Requires-python >= 3.11 by [@klieret](https://github.com/klieret) in [#940](https://github.com/princeton-nlp/SWE-agent/pull/940)
* traj inspector viewport reset by [@klieret](https://github.com/klieret) in [#946](https://github.com/princeton-nlp/SWE-agent/pull/946)
* Fix: Reset viewport when next/prev step/traj by [@klieret](https://github.com/klieret) in [#948](https://github.com/princeton-nlp/SWE-agent/pull/948)
* Fix: Disable highlighting of model outputs by [@klieret](https://github.com/klieret) in [#949](https://github.com/princeton-nlp/SWE-agent/pull/949)
* Fix: Create PRs by [@klieret](https://github.com/klieret) in [#954](https://github.com/princeton-nlp/SWE-agent/pull/954)
* Fix: Add `__init__.py` to agent/hooks by [@RNabel](https://github.com/RNabel) in [#961](https://github.com/princeton-nlp/SWE-agent/pull/961)
* Fix: Pin textual to version 1.0.0 by [@RNabel](https://github.com/RNabel) in [#960](https://github.com/princeton-nlp/SWE-agent/pull/960)
* Fix: OpenAI API: Don't pass None tool_calls to the OpenAI API by [@RNabel](https://github.com/RNabel) in [#967](https://github.com/princeton-nlp/SWE-agent/pull/967)
* Fix: Forces platform to be linux/amd64 for swe-bench batch runs by [@carlosejimenez](https://github.com/carlosejimenez) in [#942](https://github.com/princeton-nlp/SWE-agent/pull/942)
* Fix "TypeError: Cannot read properties of null (reading 'replace')" in Trajectory viewer by [@0xba1a](https://github.com/0xba1a) in [#989](https://github.com/princeton-nlp/SWE-agent/pull/989)
* Fix: No retries if costs cannot be calculated by [@klieret](https://github.com/klieret) in [#990](https://github.com/princeton-nlp/SWE-agent/pull/990)
* Fix: Race condition/size change during iteration by [@klieret](https://github.com/klieret) in [#993](https://github.com/princeton-nlp/SWE-agent/pull/993)
* Fix: Handle total cost limit exceeded by [@klieret](https://github.com/klieret) in [#994](https://github.com/princeton-nlp/SWE-agent/pull/994)
### New Contributors
* [@RNabel](https://github.com/RNabel) made their first contribution in [#961](https://github.com/princeton-nlp/SWE-agent/pull/961)
* [@dhruvji](https://github.com/dhruvji) made their first contribution in [#963](https://github.com/princeton-nlp/SWE-agent/pull/963)
* [@0xba1a](https://github.com/0xba1a) made their first contribution in [#989](https://github.com/princeton-nlp/SWE-agent/pull/989)
**Full Changelog**: https://github.com/SWE-agent/SWE-agent/compare/v1.0.0...v1.0.1
## 1.0.0 (2025-02-13)
This is a massive release that includes many new features, fixes, and changes.
You can read more about the changes in the [migration guide](migration.md).
### Added
* Fast, massively parallel code execution with [SWE-ReX](https://github.com/swe-agent/SWE-ReX).
* Run SWE-agent locally but execute code in the cloud (using modal, AWS, or anything else that runs [SWE-ReX](https://github.com/swe-agent/SWE-ReX)).
* Configurable retry mechanisms: Try multiple agent configurations, models, parameters, etc., then choose the best one.
* Flexible tool definitions with [tool bundles](../config/tools.md).
* All language models supported using `litellm` (see [models](../installation/keys.md)).
* Override any configuration option from the command line (see [command line basics](../usage/cl_tutorial.md)).
* New [command line trajectory inspector](../usage/inspector.md) to scroll through a few hundred trajectories with ease.
* [New command line interface](../usage/cli.md) with subcommands for running over single issues, batches, and various utility commands.
* Greatly simplified and cleaned up codebase. In particular, the `Agent` class is now much easier to modify.
### Changed
* The code base has been largely rewritten. Lots of things have moved and changed.
* The biggest change is that we now use [SWE-ReX](https://github.com/swe-agent/SWE-ReX) for code execution. This allowed us to remove a lot of distracting code from the agent.
* We now use [`pydantic`](https://docs.pydantic.dev/) for all configuration.
* Templates are now [`jinja2`](https://jinja.palletsprojects.com/) templates, which gives you more flexibility (but you'll have to update your templates)
* All models are now configured using `litellm` (see [models](../installation/keys.md)).
See the [migration guide](migration.md) for more details.
### New contributors
* [@manya706](https://github.com/manya706) made their first contribution in [#787](https://github.com/princeton-nlp/SWE-agent/pull/787)
* [@Prathamesh010](https://github.com/Prathamesh010) made their first contribution in [#796](https://github.com/princeton-nlp/SWE-agent/pull/796)
* [@magnimusprime](https://github.com/magnimusprime) made their first contribution in [#813](https://github.com/princeton-nlp/SWE-agent/pull/813)
* [@dependabot](https://github.com/dependabot) made their first contribution in [#817](https://github.com/princeton-nlp/SWE-agent/pull/817)
* [@Mefisto04](https://github.com/Mefisto04) made their first contribution in [#824](https://github.com/princeton-nlp/SWE-agent/pull/824)
* [@acheshkov](https://github.com/acheshkov) made their first contribution in [#857](https://github.com/princeton-nlp/SWE-agent/pull/857)
* [@yu-iskw](https://github.com/yu-iskw) made their first contribution in [#881](https://github.com/princeton-nlp/SWE-agent/pull/881)
## 0.7.0 (2024-09-23)
### Added
The main new feature is the **EnIGMA mode**, which included additions like support for Interactive Agent Tools
and Summarizers.
* Add filemap command in the spirit of repomap by [@samuela](https://github.com/samuela) in [#619](https://github.com/princeton-nlp/SWE-agent/pull/619)
* Create config to run human eval style challenges by [@ofirpress](https://github.com/ofirpress) in [#658](https://github.com/princeton-nlp/SWE-agent/pull/658)
* Add claude 3.5 sonnet to models by [@carlosejimenez](https://github.com/carlosejimenez) in [#601](https://github.com/princeton-nlp/SWE-agent/pull/601)
* Enh: Warn if scrolling >= 3 times by [@klieret](https://github.com/klieret) in [#626](https://github.com/princeton-nlp/SWE-agent/pull/626)
* feat: support deepseek-coder LLM by [@jcraftsman](https://github.com/jcraftsman) in [#638](https://github.com/princeton-nlp/SWE-agent/pull/638)
* Enh: Make timeout for agent commands configurable by [@klieret](https://github.com/klieret) in [#674](https://github.com/princeton-nlp/SWE-agent/pull/674)
* Add support for new gpt-4o-mini model by [@ivan4722](https://github.com/ivan4722) in [#693](https://github.com/princeton-nlp/SWE-agent/pull/693)
* Groq Models Integration by [@MohammedNagdy](https://github.com/MohammedNagdy) in [#721](https://github.com/princeton-nlp/SWE-agent/pull/721)
* Make log level configurable; add TRACE level by [@klieret](https://github.com/klieret) in [#612](https://github.com/princeton-nlp/SWE-agent/pull/612)
### Fixes
* Compatibility with SWE-bench 2.0 by [@klieret](https://github.com/klieret) in [#671](https://github.com/princeton-nlp/SWE-agent/pull/671)
* ensure variables work in special command docstring by [@forresty](https://github.com/forresty) in [#628](https://github.com/princeton-nlp/SWE-agent/pull/628)
* Important fix: Catch CostLimitExceeded in retry because of format/block by [@klieret](https://github.com/klieret) in [#682](https://github.com/princeton-nlp/SWE-agent/pull/682)
* Fix: Handle empty traj in should_skip by [@klieret](https://github.com/klieret) in [#616](https://github.com/princeton-nlp/SWE-agent/pull/616)
* Fix for end-marker communicate: Exit status always 0/invalid by [@klieret](https://github.com/klieret) in [#644](https://github.com/princeton-nlp/SWE-agent/pull/644)
* Fix: Insufficient quoting of git commit message by [@klieret](https://github.com/klieret) in [#646](https://github.com/princeton-nlp/SWE-agent/pull/646)
* Fix nonsensical trajectory formatting for PRs by [@klieret](https://github.com/klieret) in [#647](https://github.com/princeton-nlp/SWE-agent/pull/647)
* Fix: sweunexpected keyword 'python_version' by [@klieret](https://github.com/klieret) in [#692](https://github.com/princeton-nlp/SWE-agent/pull/692)
* Fix: Use LONG_TIMEOUT for pre_install commands by [@klieret](https://github.com/klieret) in [#695](https://github.com/princeton-nlp/SWE-agent/pull/695)
* Fix: UnboundLocalError when catching decoding issue by [@klieret](https://github.com/klieret) in [#709](https://github.com/princeton-nlp/SWE-agent/pull/709)
* Also create empty patch files for completeness by [@klieret](https://github.com/klieret) in [#725](https://github.com/princeton-nlp/SWE-agent/pull/725)
* Fix: Raise ContextWindowExceeded instead of exit_cost by [@klieret](https://github.com/klieret) in [#727](https://github.com/princeton-nlp/SWE-agent/pull/727)
* Fix: Deal with non-utf8 encoded bytes in comm by [@klieret](https://github.com/klieret) in [#731](https://github.com/princeton-nlp/SWE-agent/pull/731)
* Fix: Handle spaces in repo names by [@klieret](https://github.com/klieret) in [#734](https://github.com/princeton-nlp/SWE-agent/pull/734)
* Fix: Ensure utils is part of package by [@klieret](https://github.com/klieret) in [#742](https://github.com/princeton-nlp/SWE-agent/pull/742)
* Fix: Submitting ' ' in human mode crashes container by [@klieret](https://github.com/klieret) in [#749](https://github.com/princeton-nlp/SWE-agent/pull/749)
* Fix: Block su as command by [@klieret](https://github.com/klieret) in [#752](https://github.com/princeton-nlp/SWE-agent/pull/752)
* Fix: SWE_AGENT_MODEL_MAX_RETRIES needs casting by [@klieret](https://github.com/klieret) in [#757](https://github.com/princeton-nlp/SWE-agent/pull/757)
### New Contributors
🎉 **[@talorabr](https://github.com/talorabr), [@udiboy1209](https://github.com/udiboy1209), [@haoranxi](https://github.com/haoranxi), [@NickNameInvalid](https://github.com/NickNameInvalid), [@rollingcoconut](https://github.com/rollingcoconut) joined the team to build EnIGMA** 🎉
* [@samefarrar](https://github.com/samefarrar) made their first contribution in [#606](https://github.com/princeton-nlp/SWE-agent/pull/606)
* [@hubstrauss](https://github.com/hubstrauss) made their first contribution in [#625](https://github.com/princeton-nlp/SWE-agent/pull/625)
* [@samuela](https://github.com/samuela) made their first contribution in [#619](https://github.com/princeton-nlp/SWE-agent/pull/619)
* [@forresty](https://github.com/forresty) made their first contribution in [#628](https://github.com/princeton-nlp/SWE-agent/pull/628)
* [@jcraftsman](https://github.com/jcraftsman) made their first contribution in [#638](https://github.com/princeton-nlp/SWE-agent/pull/638)
* [@ivan4722](https://github.com/ivan4722) made their first contribution in [#693](https://github.com/princeton-nlp/SWE-agent/pull/693)
* [@JoshuaPurtell](https://github.com/JoshuaPurtell) made their first contribution in [#703](https://github.com/princeton-nlp/SWE-agent/pull/703)
* [@MohammedNagdy](https://github.com/MohammedNagdy) made their first contribution in [#721](https://github.com/princeton-nlp/SWE-agent/pull/721)
* [@pdemro](https://github.com/pdemro) made their first contribution in [#729](https://github.com/princeton-nlp/SWE-agent/pull/729)
## 0.6.1 (2024-06-20)
[All new commits](https://github.com/SWE-agent/SWE-agent/compare/v0.6.0...v0.6.1)
This is (mostly) a patch release, in particular fixing several issues that had been introduced by the speed improvements of v0.6.0.
We also solve a bug where existing linter errors in a file left SWE-agent unable to edit (because of our lint-retry-loop).
### Breaking changes
* Change: sparse clone method is now correctly called "shallow" by [@klieret](https://github.com/klieret) in [#591](https://github.com/princeton-nlp/SWE-agent/pull/591)
### Improved
* Enh: Show commands when encountering timeout error by [@klieret](https://github.com/klieret) in [#582](https://github.com/princeton-nlp/SWE-agent/pull/582)
* Enh: Configuration option to show time in log by [@klieret](https://github.com/klieret) in [#583](https://github.com/princeton-nlp/SWE-agent/pull/583)
* Enh: Allow to configure LONG_TIMEOUT for SWEEnv by [@klieret](https://github.com/klieret) in [#584](https://github.com/princeton-nlp/SWE-agent/pull/584)
* Enh: Always write log to traj directory by [@klieret](https://github.com/klieret) in [#588](https://github.com/princeton-nlp/SWE-agent/pull/588)
### Fixed
* fix `docker.errors.NotFound` by [@klieret](https://github.com/klieret) in [#587](https://github.com/princeton-nlp/SWE-agent/pull/587)
* Fix: Revert to full clone method when needed by [@klieret](https://github.com/klieret) in [#589](https://github.com/princeton-nlp/SWE-agent/pull/589)
* Fix: Refresh container_obj before querying status by [@klieret](https://github.com/klieret) in [#590](https://github.com/princeton-nlp/SWE-agent/pull/590)
* Fixed #571 - show message that model arg is ignored in case of using Azure OpenAI by [@jank](https://github.com/jank) in [#592](https://github.com/princeton-nlp/SWE-agent/pull/592)
* Fix: Linting blocks for existing lint errors by [@klieret](https://github.com/klieret) in [#593](https://github.com/princeton-nlp/SWE-agent/pull/593)
* Fix: Process done marker not found in read with timeout by [@klieret](https://github.com/klieret) in [#596](https://github.com/princeton-nlp/SWE-agent/pull/596)
## 0.6.0 (2024-06-05)
[All new commits](https://github.com/SWE-agent/SWE-agent/compare/v0.5.0...v0.6.0)
**We sped up SWE-agent by 2x** (timed with GPT-4o). This is mostly due to faster communication with the running processes inside of the Docker container and other improvements to container setup and installation. Here are a few relevant PRs:
* Switch to fast communicate and shallow clone by default by [@klieret](https://github.com/klieret) in [#530](https://github.com/princeton-nlp/SWE-agent/pull/530)
* Change: Only wait 1s for docker to start by [@klieret](https://github.com/klieret) in [#541](https://github.com/princeton-nlp/SWE-agent/pull/541)
* Feat: experimental shallow cloning by [@klieret](https://github.com/klieret) in [#498](https://github.com/princeton-nlp/SWE-agent/pull/498)
* Enh: Start from clone of python conda environment for speedup by [@klieret](https://github.com/klieret) in [#548](https://github.com/princeton-nlp/SWE-agent/pull/548)
* Enh: Use uv for editable install by default by [@klieret](https://github.com/klieret) in [#547](https://github.com/princeton-nlp/SWE-agent/pull/547)
### Improved
* Improve scrolling behavior in web UI by [@anishfish2](https://github.com/anishfish2) in [#420](https://github.com/princeton-nlp/SWE-agent/pull/420)
* Web UI: Render Markdown in agent feed messages. by [@kwight](https://github.com/kwight) in [#486](https://github.com/princeton-nlp/SWE-agent/pull/486)
* Enh: Remove redundant 'saved traj to X' messages by [@klieret](https://github.com/klieret) in [#528](https://github.com/princeton-nlp/SWE-agent/pull/528)
* Allow to disable config dump to log by [@klieret](https://github.com/klieret) in [#537](https://github.com/princeton-nlp/SWE-agent/pull/537)
* Resolve relative paths to demonstrations and commands by [@klieret](https://github.com/klieret) in [#444](https://github.com/princeton-nlp/SWE-agent/pull/444)
### Fixed
* Web UI: Remove -n option to wait by [@klieret](https://github.com/klieret) in [#487](https://github.com/princeton-nlp/SWE-agent/pull/487)
* Web UI: Kill the Flask server on exit. by [@kwight](https://github.com/kwight) in [#479](https://github.com/princeton-nlp/SWE-agent/pull/479)
* Web UI: Avoid proxy errors on MacOS by [@klieret](https://github.com/klieret) in [#506](https://github.com/princeton-nlp/SWE-agent/pull/506)
* Ensure container_name is reset for non-persistent containers by [@klieret](https://github.com/klieret) in [#463](https://github.com/princeton-nlp/SWE-agent/pull/463)
* Fix: Do not allow persistent container with cache task imgs by [@klieret](https://github.com/klieret) in [#551](https://github.com/princeton-nlp/SWE-agent/pull/551)
## 0.5.0 (2024-05-28)
[All new commits](https://github.com/SWE-agent/SWE-agent/compare/v0.4.0...v0.5.0)
✨ The big news is our [brand new documentation](https://swe-agent.com/latest/) ✨
Secondly, [@ollmer](https://github.com/ollmer) added a new flag `--cache_task_images` that will significantly speed up SWE-agent when running on the same environment/repository multiple times (no more waiting for cloning and installation!)
### Breaking changes
* We have reformatted our codebase. If you create a PR based on a previous commit, make sure you install our `pre-commit` hook to avoid merge-conflicts because of formatting. See [our docs](https://swe-agent.com/latest/dev/formatting_conflicts/) for more information.
* Remove direct imports in `__init__.py` (you can no longer `from sweagent import Agent`) by [@klieret](https://github.com/klieret) in [#436](https://github.com/princeton-nlp/SWE-agent/pull/436)
### Added
* Running the web UI is now supported when running SWE-agent completely in docker
* Speed up evaluation by caching task environments as docker images by [@ollmer](https://github.com/ollmer) in [#317](https://github.com/princeton-nlp/SWE-agent/pull/317)
### Improved
* Add gpt-4o model by [@raymyers](https://github.com/raymyers) in [#344](https://github.com/princeton-nlp/SWE-agent/pull/344)
* Web: Allow to specify commit hash by [@klieret](https://github.com/klieret) in [#358](https://github.com/princeton-nlp/SWE-agent/pull/358)
* Add default environment_setup config by [@klieret](https://github.com/klieret) in [#351](https://github.com/princeton-nlp/SWE-agent/pull/351)
* Enh: Suppress openai logging; improve formatting of stats by [@klieret](https://github.com/klieret) in [#416](https://github.com/princeton-nlp/SWE-agent/pull/416)
* Remove signal dependency by [@klieret](https://github.com/klieret) in [#428](https://github.com/princeton-nlp/SWE-agent/pull/428)
* Do not use select if running on Windows by [@klieret](https://github.com/klieret) in [#429](https://github.com/princeton-nlp/SWE-agent/pull/429)
* Use custom Config class to support env and keys.cfg (this allows passing keys as environment variables) by [@klieret](https://github.com/klieret) in [#430](https://github.com/princeton-nlp/SWE-agent/pull/430)
### Fixed
* Web: Fix script_path input by [@klieret](https://github.com/klieret) in [#334](https://github.com/princeton-nlp/SWE-agent/pull/334)
* Fix: Don't print patch msg for exit_cost patch by [@klieret](https://github.com/klieret) in [#343](https://github.com/princeton-nlp/SWE-agent/pull/343)
* Fix: Do not request job control in bash by [@klieret](https://github.com/klieret) in [#345](https://github.com/princeton-nlp/SWE-agent/pull/345)
* Fix: --base_commit not used for gh urls by [@klieret](https://github.com/klieret) in [#346](https://github.com/princeton-nlp/SWE-agent/pull/346)
* Fix: Separate data path/traj dir cause exception by [@klieret](https://github.com/klieret) in [#348](https://github.com/princeton-nlp/SWE-agent/pull/348)
* Add docker-py lower bound by [@klieret](https://github.com/klieret) in [#406](https://github.com/princeton-nlp/SWE-agent/pull/406)
* Fix: IndexError when replaying incomplete trajectories by [@klieret](https://github.com/klieret) in [#410](https://github.com/princeton-nlp/SWE-agent/pull/410)
## 0.4.0 (2024-05-09)
[All new commits](https://github.com/SWE-agent/SWE-agent/compare/v0.3.0...v0.4.0)
### Added
We're excited to launch the SWE-agent web UI! Specify a bug, press start and watch SWE-agent do the magic.
## 0.3.0 (2024-05-02)
### Added
* Run SWE-agent in the cloud using GitHub Codespaces
* Add GPT4-turbo model by [@zgrannan](https://github.com/zgrannan) in [#252](https://github.com/princeton-nlp/SWE-agent/pull/252)
* feat: Amazon Bedrock support (Claude models) by [@JGalego](https://github.com/JGalego) in [#207](https://github.com/princeton-nlp/SWE-agent/pull/207)
### Fixed
* Better error handling for --open_pr by [@klieret](https://github.com/klieret) in [#239](https://github.com/princeton-nlp/SWE-agent/pull/239)
* Fixed a potential error by [@DanjieTang](https://github.com/DanjieTang) in [#242](https://github.com/princeton-nlp/SWE-agent/pull/242)
* fix: TARGETARCH not set on some OS/docker setups by [@mspronesti](https://github.com/mspronesti) in [#249](https://github.com/princeton-nlp/SWE-agent/pull/249)
* Pass Python version to get_environment_yml by [@waterson](https://github.com/waterson) in [#271](https://github.com/princeton-nlp/SWE-agent/pull/271)
* Fix Together model validation error by [@mikanfactory](https://github.com/mikanfactory) in [#236](https://github.com/princeton-nlp/SWE-agent/pull/236)
* Doc: Avoid invalid github token by [@klieret](https://github.com/klieret) in [#292](https://github.com/princeton-nlp/SWE-agent/pull/292)
## 0.2.0 (2024-04-15)
[All new commits](https://github.com/SWE-agent/SWE-agent/compare/v0.1.2...v0.2.0)
### Added
* Allow to run on local repos (new flag: `--repo_path`) in [#193](https://github.com/princeton-nlp/SWE-agent/pull/193)
* Patch files are now saved separately to a patch directory in [#126](https://github.com/princeton-nlp/SWE-agent/pull/126)
* Allow to supply custom installation commands when running on gh issues or locally (`--environment_setup`) in [#153](https://github.com/princeton-nlp/SWE-agent/pull/153)
* Allow to specify OpenAI API base url in `keys.cfg` in [#118](https://github.com/princeton-nlp/SWE-agent/pull/118)
### Improved
* Improve error handling of docker issues in [#165](https://github.com/princeton-nlp/SWE-agent/pull/165)
* Make github token fully optional in [#189](https://github.com/princeton-nlp/SWE-agent/pull/189)
### Fixed
* Fix opening PR from fork in [#229](https://github.com/princeton-nlp/SWE-agent/pull/229)
* Fix: Choosing TogetherAI models in [#130](https://github.com/princeton-nlp/SWE-agent/pull/130)

View File

@@ -0,0 +1,12 @@
# Running SWE-agent in your browser
Running SWE-agent in your browser is the easiest way to try out our project.
<video controls>
<source src="https://github.com/SWE-agent/SWE-agent/assets/13602468/44d60674-59ca-4986-9b22-7052a45cbed9" type="video/mp4">
</video>
1. Click [![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/SWE-agent/SWE-agent)
2. Add your language model API keys to `.env` (find the file in the left sidebar and fill out the template). More information on the keys [here](keys.md).
3. Wait until the `postCreateCommand` in the terminal window at the bottom has finished.
4. Enter your SWE-agent command (see [using the command line](../usage/cl_tutorial.md)).

View File

@@ -0,0 +1,33 @@
# Setting up SWE-agent
<div class="grid cards">
<a href="source/" class="nav-card-link">
<div class="nav-card">
<div class="nav-card-header">
<span class="material-icons nav-card-icon">settings</span>
<span class="nav-card-title">Install from source</span>
</div>
<p class="nav-card-description">Install SWE-agent locally from source using pip. This is the default option.</p>
</div>
</a>
<a href="codespaces/" class="nav-card-link">
<div class="nav-card">
<div class="nav-card-header">
<span class="material-icons nav-card-icon">code</span>
<span class="nav-card-title">All in browser</span>
</div>
<p class="nav-card-description">Run in a browser-based VSCode environment. Best for a quick first peek.</p>
</div>
</a>
<a href="changelog/" class="nav-card-link">
<div class="nav-card">
<div class="nav-card-header">
<span class="material-icons nav-card-icon">newspaper</span>
<span class="nav-card-title">Changelog</span>
</div>
<p class="nav-card-description">See what's new in SWE-agent!</p>
</div>
</a>
</div>

View File

@@ -0,0 +1,155 @@
# Models and API keys
!!! abstract "Setting up models"
This page shows how you can set up your LM with SWE-agent
* Generally all API models work out of the box by just adding the key and specifying `--agent.model.name`
* More care must be taken for local models (see tips below!)
## Setting API keys
In order to access the LM of your choice (and to access private GitHub repositories), you need to supply the corresponding keys.
There are three options to do this:
1. Set the corresponding [environment variables](https://www.cherryservers.com/blog/how-to-set-list-and-manage-linux-environment-variables).
2. Create a `.env` file at the root of this repository. All of the variables defined there will take the place of environment variables.
3. Use `--agent.model.api_key` to set the key
Here's an example `.env` file:
```bash
# Remove the comment '#' in front of the line for all keys that you have set
# GITHUB_TOKEN='GitHub Token for access to private repos'
# OPENAI_API_KEY='OpenAI API Key Here if using OpenAI Model'
# ANTHROPIC_API_KEY='Anthropic API Key Here if using Anthropic Model'
# TOGETHER_API_KEY='Together API Key Here if using Together Model'
```
See the following links for tutorials on obtaining [Anthropic](https://docs.anthropic.com/en/api/getting-started), [OpenAI](https://platform.openai.com/docs/quickstart/step-2-set-up-your-api-key), and [GitHub](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens) tokens.
!!! tip "Advanced settings"
See [model config](../config/models.md) for more details on advanced settings.
## Supported API models
We support all models supported by [litellm](https://github.com/BerriAI/litellm), see their list [here](https://docs.litellm.ai/docs/providers).
!!! tip "Custom model registries"
If you're using a model that's not in the default litellm model registry (e.g., custom local models or new models), you can provide a custom model registry file using the `litellm_model_registry` configuration option. This allows proper cost tracking for any model. See the [custom cost tracking section](../config/models.md#custom-cost-tracking) for details.
Here are a few options for `--agent.model.name`:
| Model | API key | Comment |
| ----- | ------- | ------- |
| `claude-sonnet-4-20250514` | `ANTHROPIC_API_KEY` | Our recommended model |
| `gpt-4o` | `OPENAI_API_KEY` | |
| `o1-preview` | `OPENAI_API_KEY` | You might need to set temperature and sampling to the supported values. |
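The table above can be read as a simple lookup from model name to the environment variable that must be set. The following is a minimal sketch of a pre-flight check, not part of SWE-agent itself: the names `API_KEY_FOR_MODEL` and `missing_api_key` are hypothetical, and the mapping only covers the three models listed in the table.

```python
import os
from typing import Mapping, Optional

# Lookup copied from the table above; extend it for other models you use.
API_KEY_FOR_MODEL = {
    "claude-sonnet-4-20250514": "ANTHROPIC_API_KEY",
    "gpt-4o": "OPENAI_API_KEY",
    "o1-preview": "OPENAI_API_KEY",
}


def missing_api_key(model_name: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the name of the unset key for `model_name`, or None if nothing is missing.

    Hypothetical helper: models that are not in the lookup are not checked.
    """
    env = os.environ if env is None else env
    key_name = API_KEY_FOR_MODEL.get(model_name)
    if key_name is None:
        return None  # unknown model: this sketch has nothing to check
    return None if env.get(key_name) else key_name
```

Calling `missing_api_key("gpt-4o")` before starting a run tells you whether `OPENAI_API_KEY` still needs to be set in your shell or `.env` file.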
!!! warning "Function calling and more: Setting the correct parser"
The default config uses function calling to retrieve actions from the model response, i.e.,
the model directly provides the action as a JSON object.
If your model doesn't support function calling, you can use the `thought_action` parser by setting
`agent.tools.parse_function.type` to `thought_action`.
Then, we extract the last triple-backticks block from the model response as the action.
See [our API docs](../reference/parsers.md) for more details on parsers.
Remember to document the tools in your prompt as the model will not be able to see the function signature
like with function calling.
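To make the `thought_action` idea concrete, here is a rough sketch of extracting the last triple-backticks block from a model response. It is an illustration only and not SWE-agent's actual parser (see the API docs linked above for that); the function name `split_thought_action` is hypothetical, and the sketch assumes the fences in the response are balanced.

```python
import re

FENCE = "`" * 3  # three backticks, built this way so the example can live inside a fenced block

# A fence, an optional language tag, a newline, the block body, and the closing fence.
_BLOCK = re.compile(FENCE + r"[^\n]*\n(.*?)" + FENCE, re.DOTALL)


def split_thought_action(response: str) -> tuple[str, str]:
    """Split a response into (thought, action), where action is the last fenced block."""
    matches = list(_BLOCK.finditer(response))
    if not matches:
        raise ValueError("no fenced code block found in model response")
    last = matches[-1]
    thought = response[: last.start()].strip()  # everything before the last block
    action = last.group(1).strip()              # body of the last block
    return thought, action
```

Because only the last block counts as the action, earlier code snippets in the response end up in the thought.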
!!! tip "Specific models"
See [model config](../config/models.md) for more details on specific models.
## Using local models
We currently support all models that are served at an endpoint with an OpenAI-compatible API.
For example, to use llama, you can follow the [litellm instructions](https://docs.litellm.ai/docs/providers/ollama) and set
```yaml title="config/your_config.yaml"
agent:
  model:
    name: ollama/llama2  # (1)!
    api_base: http://localhost:11434
    per_instance_cost_limit: 0  # (2)!
    total_cost_limit: 0
    per_instance_call_limit: 100
    max_input_tokens: 0  # (3)!
  tools:
    # The default for obtaining actions from model outputs is function calling.
    # If your local model does not support that, you can use the thought_action parser
    # instead (see below)
    parse_function:
      type: "thought_action"
  # You probably do not need the cache control history processor if you're not
  # using Claude, so please remove it if it's in your config.
  history_processors: []
```
1. Make sure that your model name includes a "provider", i.e., follows the form `provider/model_name`. For local models, the provider and model name can be chosen arbitrarily.
2. We cannot track costs, so you must disable this (see below)
3. Disable max input tokens check
in your [config file](../config/config.md).
Note that you're always ingesting a config file: If you haven't specified it manually with `--config`, we're loading a default config, which might not
be what you want (in particular, it uses function calling and prompt caching)!
If you're using a [litellm proxy](https://docs.litellm.ai/docs/providers/openai_compatible#usage-with-litellm-proxy-server), make sure to set your `agent.model.name` to `openai/...`
and set `agent.model.api_key` to the key you've configured for your proxy (or a random value; it cannot be empty).
!!! warning "Model providers"
Make sure that your model name includes a "provider", i.e., follows the form `provider/model_name`.
For local models, the provider and model name can be chosen arbitrarily.
!!! warning "Cost/token limits"
If you do not disable the default cost limits, you will see an error because the cost calculator will not be able to find the model in the `litellm` model cost dictionary.
You have two options:
1. **Disable cost tracking** (recommended for most users): Set `per_instance_cost_limit` to 0 and use the `per_instance_call_limit` instead to limit the runtime per issue.
2. **Use a custom model registry**: If you want to track costs for your local model, you can provide a custom `litellm_model_registry` file with cost information for your model (see [here](../config/models.md#custom-cost-tracking)).
Please also make sure to set `max_input_tokens` to a non-`None` value to avoid other warnings.
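The warnings about providers and cost/token limits can be turned into a quick sanity check on the `agent.model` section of your config. This is a hedged sketch, not something SWE-agent ships: `local_model_config_problems` is a hypothetical helper, it takes the model section as a plain dictionary, and it assumes you chose option 1 (cost tracking disabled) rather than a custom model registry.

```python
def local_model_config_problems(model: dict) -> list[str]:
    """Return human-readable problems found in an `agent.model` section for a local model."""
    problems = []
    # The name must follow the form provider/model_name.
    if "/" not in str(model.get("name", "")):
        problems.append("name should include a provider, e.g. provider/model_name")
    # Cost tracking must be disabled unless a custom model registry is provided.
    for limit in ("per_instance_cost_limit", "total_cost_limit"):
        if model.get(limit) != 0:
            problems.append(f"{limit} should be 0 to disable cost tracking")
    # max_input_tokens must be set to a non-None value.
    if model.get("max_input_tokens") is None:
        problems.append("max_input_tokens should be set (0 disables the check)")
    return problems
```

An empty list means the section matches the recommendations on this page; it does not guarantee that the endpoint itself is reachable.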
!!! warning "Parsing functions"
The default config uses function calling to retrieve actions from the model response, i.e.,
the model directly provides the action as a JSON object.
If your model doesn't support function calling, you can use the `thought_action` parser by setting
`agent.tools.parse_function.type` to `thought_action`.
Then, we extract the last triple-backticks block from the model response as the action.
See [our API docs](../reference/parsers.md) for more details on parsers.
Remember to document the tools in your prompt as the model will not be able to see the function signature
like with function calling.
!!! warning "Message types"
The `cache_control` history processor requires a different message format
(e.g., `{'role': 'user', 'content': [{'type': 'text', 'text': 'some text', 'cache_control': {'type': 'ephemeral'}}]}`).
This might not be understood by all language models.
Therefore, please remove this history processor if you do not need it
(it's currently mostly used for anthropic cache control).
See [#957](https://github.com/SWE-agent/SWE-agent/issues/957) for more information.
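To illustrate the difference between the two message formats, the sketch below converts a plain-string message into the list-of-parts format shown above and back again. Both helpers (`with_cache_control`, `to_plain_text`) are hypothetical and only mirror the example message on this page; they are not SWE-agent's history processor.

```python
from typing import Any


def with_cache_control(message: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `message` in list-of-parts format, with the last part tagged for caching."""
    content = message["content"]
    if isinstance(content, str):
        parts = [{"type": "text", "text": content}]
    else:
        parts = [dict(part) for part in content]  # copy so the input is not mutated
    if parts:
        parts[-1]["cache_control"] = {"type": "ephemeral"}
    return {**message, "content": parts}


def to_plain_text(message: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `message` with list-of-parts content flattened to a plain string."""
    content = message["content"]
    if isinstance(content, str):
        return dict(message)
    text = "".join(part.get("text", "") for part in content if part.get("type") == "text")
    return {**message, "content": text}
```

A model that only understands plain-string content would need the second form, which is why removing the processor is the simpler fix.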
## Something went wrong?
* If you get `Error code: 404`, please check your configured keys, in particular
whether you set `OPENAI_API_BASE_URL` correctly (if you're not using it, the
line should be deleted or commented out).
Also see [this issue](https://github.com/SWE-agent/SWE-agent/issues/467)
for reference.
## Further reads & debugging
!!! hint "Further reads"
See [our API docs](../reference/model_config.md) for all available options.
Our [model config page](../config/models.md) has more details on specific models and tips and tricks.
{% include-markdown "../_footer.md" %}

View File

@@ -0,0 +1,47 @@
# SWE-agent 1.0.0 migration guide
Welcome to SWE-agent 1.0! So much new stuff! Here's a quick rundown of the cool new things you can do:
* :sparkles: Fast, massively parallel code execution with [SWE-ReX](https://github.com/swe-agent/SWE-ReX).
* :sparkles: Run SWE-agent locally but execute code in the cloud (using modal, AWS, or anything else that runs [SWE-ReX](https://github.com/swe-agent/SWE-ReX)).
* :sparkles: Configurable retry mechanisms: Try multiple agent configurations, models, parameters, etc., then choose the best one.
* :sparkles: Flexible tool definitions with [tool bundles](../config/tools.md).
* :sparkles: All language models supported using `litellm` (see [models](../installation/keys.md)).
* :sparkles: Override any configuration option from the command line (see [command line basics](../usage/cl_tutorial.md)).
* :sparkles: New [command line trajectory inspector](../usage/inspector.md) to scroll through a few hundred trajectories with ease.
* :sparkles: [New command line interface](../usage/cli.md) with subcommands for running over single issues, batches, and various utility commands.
* :sparkles: Greatly simplified and cleaned up codebase. In particular, the `Agent` class is now much easier to modify.
If you're familiar with the old SWE-agent, here are the main changes you need to be aware of.
!!! tip "SWE-agent 0.7"
The old SWE-agent before the major overhaul is still preserved on the `v0.7` branch.
In particular, this is still the version to use for the SWE-agent EnIGMA configuration.
You can also switch to the old documentation using the dropdown in the top menu bar.
## Command line interface and basic configuration
* Instead of the `run.py` script, we now have a central `sweagent` entry point that you can call after installing the package.
This should be much more convenient!
* Instead of one `run.py` command, we now have several subcommands: `sweagent run` to run over single issues, `sweagent run-batch` to run over a whole batch of issues, and various utility commands. Run `sweagent --help` to see all options. Splitting single-issue and batch runs into separate commands lets each use case be more convenient and flexible.
* We have switched to a hierarchical configuration system. This means that command line options look something like this: `--agent.model.name=gpt-4o`.
The configuration files have also been updated to reflect this.
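One way to picture the hierarchical configuration is that every dotted option names a path into a nested structure, which is also how the YAML config files are laid out. The sketch below shows that mapping under simplifying assumptions: `parse_dotted_options` is a hypothetical helper, it only handles the `--key=value` form, and it keeps all values as strings, whereas the real CLI does its own parsing and validation.

```python
def parse_dotted_options(args: list[str]) -> dict:
    """Turn options like --agent.model.name=gpt-4o into a nested dictionary."""
    config: dict = {}
    for arg in args:
        if not arg.startswith("--") or "=" not in arg:
            raise ValueError(f"expected --dotted.key=value, got {arg!r}")
        dotted_key, value = arg[2:].split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            node = node.setdefault(name, {})  # descend, creating levels as needed
        node[leaf] = value
    return config
```

For example, `--agent.model.name=gpt-4o` corresponds to the key `name` under `model` under `agent` in a config file.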
## Environment setup
We removed the complicated mess of environment setup options (`conda` environments, `pip`, docker images, etc.).
Instead, you now always start from a docker image of your choice and we recommend that this should ship with all the dependencies you need.
However, you can also execute additional commands before starting the agent with `EnvironmentConfig.post_startup_commands`.
Additionally, every [tool bundle](../config/tools.md) can include a `setup.sh` script that will be executed, allowing you to, e.g., install `flake8` if needed by the tools.
## More advanced configuration
* Tools now live in tool bundles in the `tools/` directory. You can mix and match tools to your liking.
## Code
The codebase has been rewritten nearly from scratch and is both more powerful and more flexible.
* The biggest change is [SWE-ReX](https://github.com/swe-agent/SWE-ReX), our new "backend" that handles all code execution.
* As a result of this, the `SWEEnv` class is basically gone and is now only a small wrapper around a `swerex` runtime.
* The `Agent` class also has gotten a lot simpler. It also delegates a lot of tool/execution logic to the new `Tools` class.

View File

@@ -0,0 +1,55 @@
# Installation from source
Installation from source is the preferred way to set up SWE-agent on your machine.
1. Clone the repository, for example with
```bash
git clone https://github.com/SWE-agent/SWE-agent.git
```
2. Run
```bash
python -m pip install --upgrade pip && pip install --editable .
```
at the repository root (as with any python setup, it's recommended to use [conda][] or [virtual environments][] to manage dependencies).
3. Set up your language model of choice as explained [here](keys.md).
Let's run a quick check:
```bash
sweagent --help
```
should show an overview of the available top-level commands.
<details>
<summary>Command not found?</summary>
You might also try <code>python -m sweagent</code>. If this also doesn't work,
please check with <code>which python</code> that you're using the same python as
when you installed SWE-agent.
</details>
**Optional installation steps:**
1. The default backend for SWE-agent is docker, so we recommend installing Docker
([follow the docs](https://github.com/docker/docker-install) or use the [get-docker.sh script for linux](https://github.com/docker/docker-install)),
then start Docker locally. Problems? See [docker issues](tips.md#docker).
If you do not want to use docker, you can still use SWE-agent with code evaluation in the cloud.
2. If you plan on using the web-based GUI: Install [`Node.js`][nodejs-install].
[nodejs-install]: https://docs.npmjs.com/downloading-and-installing-node-js-and-npm
!!! tip "Installation tips"
* If you run into docker issues, see the [installation tips section](tips.md) for more help.
* SWE-agent is still in active development. Features and enhancements are added often.
To make sure you are on the latest version, periodically run `git pull`
(there is no need to redo the `pip install`).
* SWE-agent EnIGMA is currently only compatible with `v0.7` of SWE-agent. Please run `git switch v0.7` after step 1 to switch to the correct version.
* Want to modify SWE-agent? Great! There are a few extra steps and tips:
Please check our [contribution guide](../dev/contribute.md).
[conda]: https://docs.conda.io/en/latest/
[virtual environments]: https://realpython.com/python-virtual-environments-a-primer/
{% include-markdown "../_footer.md" %}

View File

@@ -0,0 +1,26 @@
# Installation troubleshooting
## Docker issues <a name="docker"></a>
First, test if you can use docker in general, for example by running
```bash
docker run hello-world
```
If you get an error like
```
docker: permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock:
Head "http://%2Fvar%2Frun%2Fdocker.sock/_ping": dial unix /var/run/docker.sock: connect: permission denied.
```
* Make sure that you allow the use of the Docker socket. In Docker desktop, click *Settings* > *Advanced* > *Allow the default Docker socket to be used (requires password)*.
* On the command line, you can try `sudo chmod 666 /var/run/docker.sock` or add your user to the `docker` Linux user group.
* If your docker installation uses a different socket, you might have to symlink them, see [this command for example](https://github.com/SWE-agent/SWE-agent/issues/20#issuecomment-2047506005)
If you are using any containers from dockerhub (i.e., you ran `docker pull ...` or you are running `docker run ...`), please make sure that you are using the latest
versions. Just because an image has the `latest` tag (e.g., `sweagent/swe-agent-run:latest`) does not mean that it will auto-update. Please run
`docker pull sweagent/swe-agent-run:latest` to make sure you actually have the most recent version!
Any remaining issues? Please [open a GitHub issue](https://github.com/SWE-agent/SWE-agent/issues/new/choose)!

View File

@@ -0,0 +1,108 @@
{% extends "base.html" %}
{% block announce %}
📣 We now recommend <a href="https://mini-swe-agent.com">mini-swe-agent</a> instead of SWE-agent: same performance, but much simpler &amp; more flexible
{% endblock %}
{% block content %}
{{ super() }}
<!-- Maintenance mode popup -->
<div id="maintenance-popup-overlay" style="
display: none;
position: fixed;
inset: 0;
z-index: 9999;
background: rgba(0, 0, 0, 0.2);
backdrop-filter: blur(4px);
align-items: center;
justify-content: center;
">
<div style="
background: var(--md-default-bg-color, #fff);
color: var(--md-default-fg-color, #333);
border-radius: 12px;
padding: 2.5rem 2rem 2rem;
max-width: 800px;
width: 90%;
box-shadow: 0 8px 32px rgba(0, 0, 0, 0.25);
text-align: center;
font-family: var(--md-text-font-family);
">
<div style="font-size: 0.92rem; line-height: 1.6; margin: 0 0 1.8rem;">
<p style="margin: 0 0 0.4rem;">SWE-agent has been superseded by <strong>mini-swe-agent</strong>.</p>
<p style="margin: 0 0 0.4rem;">mini-swe-agent is simpler &amp; more flexible while still being as performant.</p>
<p style="margin: 0;">See the <a href="https://mini-swe-agent.com/latest/faq/">FAQ</a> for more details about why you should switch.
<br/>SWE-agent is now in maintenance-only mode.</p>
</div>
<div style="display: flex; gap: 0.75rem; justify-content: center; flex-wrap: wrap;">
<a href="https://mini-swe-agent.com" style="
display: inline-block;
padding: 0.6rem 1.4rem;
border-radius: 6px;
background: #0099ff;
color: #fff;
text-decoration: none;
font-weight: 600;
font-size: 0.88rem;
transition: background 0.2s;
" onmouseover="this.style.background='#007acc'" onmouseout="this.style.background='#0099ff'">
Check out mini-swe-agent
</a>
<button onclick="dismissMaintenancePopup()" style="
padding: 0.6rem 1.4rem;
border-radius: 6px;
background: transparent;
border: 1px solid var(--md-default-fg-color--lighter, #ccc);
color: var(--md-default-fg-color, #333);
cursor: pointer;
font-size: 0.88rem;
transition: background 0.2s;
" onmouseover="this.style.background='var(--md-code-bg-color, #eee)'" onmouseout="this.style.background='transparent'">
Continue to SWE-agent
</button>
</div>
</div>
</div>
<script>
function dismissMaintenancePopup() {
document.getElementById('maintenance-popup-overlay').style.display = 'none';
sessionStorage.setItem('maintenance-popup-dismissed', '1');
}
(function() {
if (!sessionStorage.getItem('maintenance-popup-dismissed')) {
var overlay = document.getElementById('maintenance-popup-overlay');
overlay.style.display = 'flex';
overlay.addEventListener('click', function(e) {
if (e.target === overlay) dismissMaintenancePopup();
});
}
})();
</script>
<!-- Floating Bubbles -->
<div class="floating-bubbles">
<div class="floating-bubbles-title">Our projects</div>
<a href="https://mini-swe-agent.com/" class="bubble" title="Mini-SWE-Agent">
<img src="{{ 'assets/mini_logo.svg' | url }}" alt="Mini-SWE-Agent">
<span class="bubble-tooltip">Mini-SWE-Agent</span>
</a>
<a href="https://swe-rex.com/" class="bubble" title="SWE-rex">
<img src="{{ 'assets/swerex_logo.svg' | url }}" alt="SWE-rex">
<span class="bubble-tooltip">SWE-ReX</span>
</a>
<a href="https://swesmith.com" class="bubble" title="SWE-smith">
<img src="{{ 'assets/swesmith_logo.png' | url }}" alt="SWE-smith">
<span class="bubble-tooltip">SWE-smith</span>
</a>
<a href="https://swebench.com" class="bubble" title="SWE-bench">
<img src="{{ 'assets/swebench_logo.png' | url }}" alt="SWE-bench">
<span class="bubble-tooltip">SWE-bench</span>
</a>
<a href="https://www.swebench.com/sb-cli/" class="bubble" title="sb-cli">
<img src="{{ 'assets/sbcli_logo.svg' | url }}" alt="sb-cli">
<span class="bubble-tooltip">sb-cli</span>
</a>
</div>
{% endblock %}

View File

@@ -0,0 +1,9 @@
# The agent class
This page documents the `Agent` class, which runs the main loop of the agent.
To learn about the configuration objects used to specify the behavior of an agent,
see the [agent configuration reference page](agent_config.md).
::: sweagent.agent.agents.RetryAgent
::: sweagent.agent.agents.DefaultAgent

View File

@@ -0,0 +1,26 @@
# Agent configuration
This page documents the configuration objects used to specify the behavior of an agent.
To learn about the agent class itself, see the [agent class reference page](agent.md).
It might be easiest to simply look at some of our example configurations in the [config dir](https://github.com/SWE-agent/SWE-agent/tree/main/config).
<details>
<summary>Example: default config <code>default.yaml</code></summary>
```yaml
--8<-- "config/default.yaml"
```
</details>
Currently, there are three main agent configuration classes:
* `DefaultAgentConfig`: This is the default agent.
* `RetryAgentConfig`: A "meta agent" that instantiates multiple agents for multiple attempts and then picks the best solution.
* `ShellAgentConfig`: Config for `ShellAgent` (invoked with `sweagent sh`), which is an experimental mode for quick & interactive (human-in-the-loop) workflows.
::: sweagent.agent.agents.RetryAgentConfig
::: sweagent.agent.agents.DefaultAgentConfig
::: sweagent.agent.agents.ShellAgentConfig

View File

@@ -0,0 +1,11 @@
# Batch instances
This page documents the batch instances functionality for running SWE-agent on multiple problems.
::: sweagent.run.batch_instances
options:
members_order: source
show_root_heading: false
show_root_toc_entry: false
show_source: false
parameter_headings: false

View File

@@ -0,0 +1,12 @@
# Tool bundle configuration
!!! note "Tool bundle configuration"
This is the page for configuring a tool bundle, not for setting up the tools that are being used for the agent.
For the latter, see [tools configuration](tools_config.md).
::: sweagent.tools.bundle.BundleConfig
::: sweagent.tools.commands.Command
::: sweagent.tools.commands.Argument

View File

@@ -0,0 +1,9 @@
# The environment class
This page documents the `SWEEnv` class, which provides the environment for the agent to interact with.
To learn about the configuration objects used to specify the behavior of the environment,
see the [environment configuration reference page](env_config.md).
::: sweagent.environment.swe_env.SWEEnv
options:
allow_inspection: false

View File

@@ -0,0 +1,6 @@
# Environment configuration
This page documents the configuration objects used to specify the behavior of the environment.
To learn about the environment class itself, see the [environment class reference page](env.md).
::: sweagent.environment.swe_env.EnvironmentConfig

View File

@@ -0,0 +1,6 @@
# Custom exceptions raised by SWE-agent
::: sweagent.exceptions
options:
show_root_full_path: false
show_bases: false

View File

@@ -0,0 +1,23 @@
# History processor configuration
History processors filter the history/trajectory before it is used to query the model.
For example, a very simple history processor would be one that strips away old observations to reduce context when querying the model.
You can set them as follows:
```yaml
agent:
history_processors:
- type: last_n_observations
n: 5
```
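As a rough illustration of what a processor like `last_n_observations` does, the sketch below keeps the content of the last `n` observations and replaces older ones with a short placeholder. This is not the actual `LastNObservations` implementation documented below: the function name, the placeholder text, and the assumption that observations are dictionaries marked with `message_type == "observation"` are all chosen for this example.

```python
PLACEHOLDER = "(old observation omitted)"


def keep_last_n_observations(history: list[dict], n: int) -> list[dict]:
    """Return a copy of `history` in which all but the last `n` observations are elided."""
    observation_indices = [
        i for i, message in enumerate(history)
        if message.get("message_type") == "observation"
    ]
    # Guard n <= 0 explicitly: a slice of [-0:] would keep everything.
    keep = set(observation_indices[-n:]) if n > 0 else set()
    processed = []
    for i, message in enumerate(history):
        if message.get("message_type") == "observation" and i not in keep:
            processed.append({**message, "content": PLACEHOLDER})
        else:
            processed.append(message)
    return processed
```

Old observations are replaced rather than deleted so that the sequence of turns stays intact while the context sent to the model shrinks.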
::: sweagent.agent.history_processors.DefaultHistoryProcessor
::: sweagent.agent.history_processors.LastNObservations
::: sweagent.agent.history_processors.TagToolCallObservations
::: sweagent.agent.history_processors.CacheControlHistoryProcessor
::: sweagent.agent.history_processors.RemoveRegex

View File

@@ -0,0 +1,6 @@
# API Reference
This section shows all configuration options, as well as API details for the main classes.
!!! tip "SWE-agent architecture"
Before you dive into the code, please first read the [architecture page](../background/architecture.md) for an overview of SWE-agent.

View File

@@ -0,0 +1,55 @@
# Models
!!! hint "Tutorial"
Please see the [model section in the installation guide](../installation/keys.md) for an overview of the different models and how to configure them.
This page documents the configuration objects used to specify the behavior of a language model (LM).
In most cases, you will want to use the `GenericAPIModelConfig` object.
## API LMs
::: sweagent.agent.models.GenericAPIModelConfig
options:
heading_level: 3
::: sweagent.agent.models.RetryConfig
options:
heading_level: 3
## Manual models for testing
The following two models allow you to test your environment by prompting you for actions.
This can also be very useful to create your first [demonstrations](../config/demonstrations.md).
::: sweagent.agent.models.HumanModel
options:
heading_level: 3
::: sweagent.agent.models.HumanModelConfig
options:
heading_level: 3
::: sweagent.agent.models.HumanThoughtModel
options:
heading_level: 3
::: sweagent.agent.models.HumanThoughtModelConfig
options:
heading_level: 3
## Replay model for testing and demonstrations
::: sweagent.agent.models.ReplayModel
options:
heading_level: 3
::: sweagent.agent.models.ReplayModelConfig
options:
heading_level: 3
::: sweagent.agent.models.InstantEmptySubmitModelConfig
options:
heading_level: 3

View File

@@ -0,0 +1,5 @@
# Action parsers
Action parsers interpret the output of the LM and return actions (tool calls/commands to execute) and thoughts (any other output of the LM, e.g., justifying the tool calls).
::: sweagent.tools.parsing

View File

@@ -0,0 +1,57 @@
# Specifying problems
We currently support the following input types:
* A string of text (`TextProblemStatement`)
* A path to a file on the local filesystem (`FileProblemStatement`)
* A URL to a GitHub issue (`GithubIssue`)
* An empty problem statement (`EmptyProblemStatement`)
With `sweagent run`, you can specify the problem statement type with the `--problem_statement` flag.
For example:
```bash title="From text"
--problem_statement.text="This is a problem statement"
--problem_statement.type=text
```
```bash title="From a file"
--problem_statement.path=path/to/file.txt
--problem_statement.type=text_file
```
```bash title="From a GitHub issue"
--problem_statement.github_url=https://github.com/org/repo/issues/123
--problem_statement.type=github_issue
```
See below for more details on the configuration options.
All of these classes are defined in `sweagent.agent.problem_statement`.
::: sweagent.agent.problem_statement.problem_statement_from_simplified_input
options:
show_root_full_path: false
show_bases: false
::: sweagent.agent.problem_statement.TextProblemStatement
options:
show_root_full_path: false
show_bases: false
::: sweagent.agent.problem_statement.FileProblemStatement
options:
show_root_full_path: false
show_bases: false
::: sweagent.agent.problem_statement.GithubIssue
options:
show_root_full_path: false
show_bases: false
::: sweagent.agent.problem_statement.EmptyProblemStatement
options:
show_root_full_path: false
show_bases: false

View File

@@ -0,0 +1,44 @@
# Configuring repositories
We currently support the following repository types:
* A pre-existing repository (`PreExistingRepoConfig`)
* A local repository (`LocalRepoConfig`)
* A GitHub repository (`GithubRepoConfig`)
With `sweagent run`, you can specify the repository type with the `--env.repo` flag.
For example:
```bash title="From a pre-existing repository"
--env.repo.repo_name="testbed" # (1)!
--env.repo.type=preexisting
```
1. Folder name at the root of the deployment
```bash title="From a local repository"
--env.repo.path=/path/to/repo
--env.repo.type=local
```
All of these classes are defined in `sweagent.environment.repo`.
::: sweagent.environment.repo.PreExistingRepoConfig
options:
show_root_full_path: false
show_bases: false
::: sweagent.environment.repo.LocalRepoConfig
options:
show_root_full_path: false
show_bases: false
::: sweagent.environment.repo.GithubRepoConfig
options:
show_root_full_path: false
show_bases: false
::: sweagent.environment.repo.repo_from_simplified_input
options:
show_root_full_path: false
show_bases: false

View File

@@ -0,0 +1,5 @@
# Batch run configuration
This page documents the configuration options for running SWE-agent in batch mode on multiple problems.
::: sweagent.run.run_batch.RunBatchConfig

View File

@@ -0,0 +1,7 @@
# Single run configuration
This page documents the configuration options for running SWE-agent on a single problem.
::: sweagent.run.run_single.RunSingleConfig
::: sweagent.run.run_single.RunSingleActionConfig

View File

@@ -0,0 +1,7 @@
# Template configuration
This page documents the template configuration used by SWE-agent to define message templates and formatting.
Templates are used to format inputs and outputs into prompts/messages to query the LM.
::: sweagent.agent.agents.TemplateConfig

View File

@@ -0,0 +1,14 @@
# Tool configuration
This shows how to configure tools for SWE-agent.
!!! note "Tool configuration"
This is the page for configuring tools for SWE-agent, not for setting up the tools that are being used for the agent.
For the latter, see [tool bundles](bundle_config.md).
::: sweagent.tools.tools.ToolConfig
::: sweagent.tools.tools.ToolFilterConfig
::: sweagent.tools.bundle.Bundle

View File

@@ -0,0 +1,310 @@
# Adding Custom Tools
!!! abstract "Adding custom tools to SWE-agent"
This tutorial walks you through creating and integrating custom tools into SWE-agent.
We'll create a fun `print_cat` command that prints ASCII art of a cat for a morale boost!
Please read our [hello world](hello_world.md) and [command line basics](cl_tutorial.md) tutorials before proceeding.
## Understanding Tool Structure
Every SWE-agent tool is organized as a "bundle" - a directory containing:
1. **`config.yaml`** - Defines the tool's interface and documentation
2. **`bin/`** directory - Contains the executable scripts
3. **`install.sh`** (optional) - Sets up dependencies
4. **`lib/`** (optional) - Contains shared libraries or utilities
Let's look at the simple `submit` tool structure:
```
tools/submit/
├── config.yaml
└── bin/
    └── submit
```
## Step 1: Write the Command Script
First, let's create our `morale_boost` tool bundle:
```bash
mkdir -p tools/morale_boost/bin
```
Create the executable script that will be called when the agent runs your command:
```bash title="tools/morale_boost/bin/print_cat"
#!/bin/bash
# print_cat - A morale-boosting ASCII cat printer!
echo "🐱 Here's a cat to boost your morale! 🐱"
echo ""
cat << 'EOF'
/\_/\
( o.o )
> ^ <
_) (_
(_____)
EOF
echo ""
echo "You're doing great! Keep coding! 💪"
```
If you're wondering about the strange `EOF` construct, those are called
[heredocs](https://linuxize.com/post/bash-heredoc/) and enable multiline
arguments in bash.
## Step 2: Define the Tool Configuration
Create the `config.yaml` file that tells SWE-agent about your tool:
```yaml title="tools/morale_boost/config.yaml"
tools:
  print_cat:
    signature: "print_cat"
    docstring: "Prints an ASCII art cat for morale boost. Use when you need encouragement!"
    arguments: []
```
!!! tip "Tool Configuration Options"
The `config.yaml` supports various options:
- **`signature`**: How the command should be called (including arguments)
- **`docstring`**: Description that helps the AI understand when to use this tool
- **`arguments`**: List of arguments with types, descriptions, and whether they're required
For a tool with arguments, it would look like this:
```yaml title="Example with arguments"
tools:
  print_animal:
    signature: "print_animal <animal_type> [--size=<size>]"
    docstring: "Prints ASCII art of the specified animal"
    arguments:
      - name: animal_type
        type: string
        description: "Type of animal to print (cat, dog, elephant)"
        required: true
      - name: size
        type: string
        description: "Size of the art (small, medium, large)"
        required: false
```
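Before wiring a bundle into a config, it can help to check that the `<placeholders>` in a `signature` line up with the names under `arguments`. The sketch below does this for a tool entry given as a plain dictionary mirroring the YAML above. The helpers `signature_placeholders` and `check_tool` are hypothetical and are not part of SWE-agent; whether SWE-agent itself enforces this correspondence is not assumed here.

```python
import re


def signature_placeholders(signature: str) -> list[str]:
    """Return the <name> placeholders of a signature, in order of appearance."""
    return re.findall(r"<(\w+)>", signature)


def check_tool(tool: dict) -> list[str]:
    """Return a list of problems; an empty list means signature and arguments agree."""
    placeholders = signature_placeholders(tool["signature"])
    declared = [arg["name"] for arg in tool.get("arguments", [])]
    problems = []
    for name in placeholders:
        if name not in declared:
            problems.append(f"<{name}> appears in the signature but is not declared")
    for name in declared:
        if name not in placeholders:
            problems.append(f"argument '{name}' is declared but missing from the signature")
    return problems


# Same content as the print_animal entry, written as a Python dictionary.
print_animal = {
    "signature": "print_animal <animal_type> [--size=<size>]",
    "docstring": "Prints ASCII art of the specified animal",
    "arguments": [
        {"name": "animal_type", "type": "string", "required": True},
        {"name": "size", "type": "string", "required": False},
    ],
}
problems = check_tool(print_animal)
```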
## Step 3: Tell the agent to use the new tool
Now you need to tell SWE-agent to use your new tool. Copy `config/default.yaml` to `config/my_custom_config.yaml`
and make the following modifications:
```yaml title="config/my_custom_config.yaml"
agent:
  templates:
    instance_template: |-
      (...)
      Don't forget to use `print_cat` when you need encouragement!
      Your thinking should be thorough and so it's fine if it's very long.
  tools:
    bundles:
      - path: tools/registry
      - path: tools/edit_anthropic
      - path: tools/review_on_submit_m
      - path: tools/morale_boost  # Add our custom tool bundle!
    # everything else stays the same
```
## Step 4: Let's test it
Now you can test your tool by running SWE-agent with your custom configuration:
```bash
sweagent run \
--config config/my_custom_config.yaml \
--agent.model.name=gpt-4o \
--env.repo.github_url=https://github.com/SWE-agent/test-repo \
--problem_statement.text="Add a simple hello world function to the repository. Feel free to use print_cat for morale!"
```
The agent should now have access to your `print_cat` command and may use it during execution!
## Advanced Tool Features
### Multiple Commands in One Bundle
You can define multiple commands in a single tool bundle:
```yaml title="tools/morale_boost/config.yaml - Extended version"
tools:
  print_cat:
    signature: "print_cat"
    docstring: "Prints an ASCII art cat for morale boost"
    arguments: []
  print_dog:
    signature: "print_dog"
    docstring: "Prints an ASCII art dog for variety"
    arguments: []
  motivate:
    signature: "motivate <message>"
    docstring: "Prints a motivational message with ASCII art"
    arguments:
      - name: message
        type: string
        description: "The motivational message to display"
        required: true
```
Don't forget to create the corresponding scripts in the `bin/` directory!
### Using Python Libraries
If you prefer Python, you can create Python-based tools:
```python title="tools/morale_boost/bin/print_cat"
#!/usr/bin/env python3
"""
A morale-boosting cat printer in Python!
"""
print("🐱 Here's a cat to boost your morale! 🐱")
```
If your Python tool needs additional dependencies, create an `install.sh` script:
```bash title="tools/morale_boost/install.sh"
#!/bin/bash
# This script runs when the tool bundle is installed
# Example: Install Python packages
pip install cowsay
pip install colorama
echo "Morale boost tools installed! Ready to spread joy! 🎉"
```
### Environment Variables and Context
Your tools can access environment variables and the current working context:
```bash title="Example using environment variables"
#!/bin/bash
echo "Current working directory: $PWD"
echo "Repository root: $ROOT"
echo "My custom variable: $MY_CUSTOM_VAR"
```
Adding environment variables to your config file is a simple way of customizing your tools:
```yaml title="config/my_custom_config.yaml"
agent:
  tools:
    env_variables:
      PAGER: cat
      MY_CUSTOM_VAR: "Hello from config!"
      DEBUG_MODE: "true"
    # ... rest of config
```
For sensitive data like API keys, you can propagate environment variables from your host system:
```yaml title="config/my_custom_config.yaml"
agent:
  tools:
    propagate_env_variables:
      - "OPENAI_API_KEY"
      - "GITHUB_TOKEN"
      - "MY_SECRET_KEY"
    # ... rest of config
```
Your tools can then access these variables:
```bash title="tools/morale_boost/bin/print_cat_with_api"
#!/bin/bash
if [ -n "$GITHUB_TOKEN" ]; then
echo "🐱 Connected to GitHub! Ready to boost morale across repos!"
else
echo "🐱 Just a local morale booster today!"
fi
```
However, for keeping internal state, we generally recommend using the Python bindings of the `registry` bundle
instead of environment variables.
### Using the Registry Bundle
The registry bundle provides a persistent key-value store that survives across tool calls. This is better than environment variables because
you can store complex data structures (lists, dictionaries) as JSON.
**Setting registry variables in your config:**
```yaml title="config/my_custom_config.yaml"
agent:
  tools:
    registry_variables:
      MY_CUSTOM_SETTING: "hello world"
      MORALE_MESSAGES:
        - "You're doing great!"
        - "Keep up the good work!"
        - "Almost there!"
      DEBUG_MODE: true
    bundles:
      - path: tools/registry  # Always include this first!
      - path: tools/morale_boost
```
**Accessing registry variables in your Python tools:**
```python title="tools/morale_boost/bin/print_motivational_cat"
#!/usr/bin/env python3
import random

from registry import registry


def main():
    # Get a simple value with a fallback
    setting = registry.get("MY_CUSTOM_SETTING", "default value")
    print(f"Setting: {setting}")
    # Get a list of messages
    messages = registry.get("MORALE_MESSAGES", [])
    if messages:
        message = random.choice(messages)
        print(f"🐱 {message}")
    # Set a value (persists across tool calls)
    registry["LAST_MORALE_BOOST"] = "2024-01-15 10:30:00"
    print("Morale boosted! 🚀")


if __name__ == "__main__":
    main()
```
**Accessing registry variables in bash tools:**
```bash title="tools/morale_boost/bin/print_simple_cat"
#!/bin/bash
# Read from registry using the _read_env helper
CUSTOM_SETTING=$(_read_env "MY_CUSTOM_SETTING" "default value")
DEBUG_MODE=$(_read_env "DEBUG_MODE" "false")
echo "🐱 Custom setting: $CUSTOM_SETTING"
if [ "$DEBUG_MODE" = "true" ]; then
echo "Debug mode is enabled!"
fi
```
The registry is particularly useful for complex tools that need to maintain state across multiple invocations, like the `review_on_submit_m` tool that tracks submission stages.
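Why does state need a store at all? Each tool call typically runs as a fresh process, so anything kept in ordinary variables is gone by the next call. The sketch below shows the general idea with a tiny JSON-file-backed store. It is an illustrative stand-in under that assumption, not the implementation of the `registry` bundle; the class `JsonRegistry` and the key `SUBMIT_STAGE` are made up for this example.

```python
import json
import tempfile
from pathlib import Path


class JsonRegistry:
    """Tiny file-backed key-value store (illustrative, not the registry bundle)."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def _load(self) -> dict:
        # A missing file simply means that nothing has been stored yet.
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text())

    def get(self, key: str, default=None):
        return self._load().get(key, default)

    def __setitem__(self, key: str, value) -> None:
        data = self._load()
        data[key] = value
        # JSON lets us persist lists and dictionaries, not only strings.
        self.path.write_text(json.dumps(data))


# Two separate objects stand in for two separate tool invocations.
store_path = Path(tempfile.mkdtemp()) / "registry.json"
JsonRegistry(store_path)["SUBMIT_STAGE"] = 1
later_call = JsonRegistry(store_path)
stage = later_call.get("SUBMIT_STAGE", 0)
```

Because every read goes back to the file, the second object sees the value written by the first, which is the property a multi-stage tool relies on.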
## State commands and more
Take a look at our [tool documentation](../config/tools.md).
{% include-markdown "../_footer.md" %}

View File

@@ -0,0 +1,231 @@
# Batch mode
!!! abstract "Running on many issues at once"
You have used `sweagent run`. To become a real power user, we'll convert you to `sweagent run-batch` and you can run on a hundred issues at once.
* Please make sure you're familiar with [the command line basics](cl_tutorial.md).
* The default examples will be executing code in a Docker sandbox, so make sure you have docker installed ([docker troubleshooting](../installation/tips.md)).
If you cannot run docker, skim through the examples below and adapt accordingly.
## A first example: SWE-bench
So you've decided to run SWE-agent on a lot of issues in parallel. Great, that is exactly what the `run-batch` command is for.
Let's run on three [SWE-bench](https://www.swebench.com/) issues which will be downloaded automatically.
```bash
sweagent run-batch \
--config config/default.yaml \
--agent.model.name gpt-4o \
--agent.model.per_instance_cost_limit 2.00 \
--instances.type swe_bench \
--instances.subset lite \
--instances.split dev \
--instances.slice :3 \
--instances.shuffle=True
```
Let's look at the options:
1. `--instances.type swe_bench`: There are a couple of built-in ways to configure instances. This option selects the SWE-bench dataset.
2. `--instances.subset lite`: The SWE-bench project provides a few datasets. Lite is a subset of GitHub issues selected with a few heuristic filters that make them more likely to be solvable.
3. `--instances.split dev`: Most datasets have a `dev` and a `test` split.
4. `--instances.slice :3`: The `--instances.slice` option allows you to select a subset of instances from the dataset. It works just like Python's `list[...]` slicing, so you can specify `:10` to take the first 10 instances, `10:20` to take the next 10, `-10:` to take the last 10, or `10:20:2` to take every second instance in that range.
5. `--instances.shuffle=True`: Shuffle all instances before slicing. This is a deterministic operation, so the same command will always return the same instances in the same order.
* There are some things that you should recognize: All of the `--agent` options are available and you can still specify `--config` files.
* However, the `--problem_statement`, `--repo`, and `--env` options obviously need to change, because you now want to populate these settings automatically from a source.
This is where the new option comes in: `--instances`, specifying the **instance source** together with a few options.
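The slicing and shuffling semantics described above can be sketched in a few lines of Python. This is a minimal illustration, not SWE-agent's code: the helper names `parse_slice` and `select_instances` and the seed value `42` are assumptions made for the example.

```python
import random


def parse_slice(spec: str) -> slice:
    """Turn a string such as ':3' or '10:20:2' into a Python slice object."""
    if ":" not in spec:
        raise ValueError(f"expected a slice such as ':3', got {spec!r}")
    parts = [int(part) if part else None for part in spec.split(":")]
    if len(parts) > 3:
        raise ValueError(f"too many ':' in slice {spec!r}")
    return slice(*parts)


def select_instances(instances, spec: str, shuffle: bool = False, seed: int = 42) -> list:
    """Optionally shuffle with a fixed seed, then slice."""
    items = list(instances)
    if shuffle:
        # A fixed seed makes the shuffle reproducible; the value 42 is an assumption.
        random.Random(seed).shuffle(items)
    # Shuffling happens before slicing, so ':3' picks three shuffled instances.
    return items[parse_slice(spec)]


ids = [f"instance_{i}" for i in range(30)]
first_three = select_instances(ids, ":3")
every_second = select_instances(ids, "10:20:2")
shuffled_three = select_instances(ids, ":3", shuffle=True)
```

Because the shuffle is seeded, repeating the last call returns the same three instances, which is what makes a shuffled run reproducible.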
!!! tip "Tooltips"
Click on the :material-chevron-right-circle: icon in the right margin of the code snippet to see more information about the line.
The output should remind you a lot of the output of the [hello world tutorial](hello_world.md), except for the progress bar at the bottom.
Kind of slow, isn't it?
!!! tip "All command line options"
* See [`RunBatchConfig`](../reference/run_batch_config.md#sweagent.run.run_batch.RunBatchConfig) for an overview of all options.
* SWE-bench config: [`SWEBenchInstances`](../reference/batch_instances.md#sweagent.run.batch_instances.SWEBenchInstances).
!!! tip "Evaluating on SWE-bench"
If you are using [`sb-cli`](https://www.swebench.com/sb-cli/), you can automatically evaluate on SWE-bench by adding the `--evaluate=True` flag.
Predictions are then submitted to `sb-cli` while the run is still in progress, so you should receive results within a minute of finishing your run.
## Multimodal SWE-bench
SWE-agent supports the **SWE-bench Multimodal** dataset, which includes GitHub issues with associated images (screenshots, diagrams, UI mockups). To run on multimodal instances:
```bash
sweagent run-batch \
--config config/default_mm_with_images.yaml \
--agent.model.name claude-sonnet-4-20250514 \
--agent.model.per_instance_cost_limit 2.00 \
--instances.type swe_bench \
--instances.subset multimodal \
--instances.split dev \
--instances.slice :3 \
--instances.shuffle=True
```
Key differences for multimodal runs:
- **Configuration**: Use `config/default_mm_with_images.yaml` which includes image processing capabilities
- **Subset**: Use `--instances.subset multimodal` to access the multimodal dataset
- **Token limits**: Consider higher cost limits as images consume more tokens
- **Multimodal Tools**: `tools/image_tools` and `tools/web_browser` include useful tools for viewing images and using a web browser
The system automatically:
- Downloads images from GitHub issue URLs
- Converts them to base64 markdown format
- Provides visual context to the AI model
!!! tip "Multimodal Configuration"
See the [multimodal guide](multimodal.md) for detailed configuration options and troubleshooting.
## Running in parallel
Let's speed things up and run on 3 instances at once. Only a single line to change:
```bash hl_lines="4-4"
sweagent run-batch \
--config config/default.yaml \
--agent.model.name gpt-4o \
--num_workers 3 \
--agent.model.per_instance_cost_limit 2.00 \
--instances.type swe_bench \
--instances.subset lite \
--instances.split dev \
--instances.slice :3 \
--instances.shuffle=True
```
You'll see output that looks like this (only with 3 workers instead of 30):
<figure markdown="span">
![SWE-ReX in action](../assets/batch_animated.gif){: style="max-width: 500px; border-radius: 10px;"}
<figcaption>SWE-agent running on 30 SWE-bench instances</figcaption>
</figure>
!!! tip "All command line options"
See [`RunBatchConfig`](../reference/run_batch_config.md#sweagent.run.run_batch.RunBatchConfig) for an overview of all options.
When starting many parallel instances with the docker backend, container startup can become a bottleneck
(e.g., on a platform with few CPUs, you might see timeouts because there are not enough CPUs to start all containers in time).
In this case, set `--random_delay_multiplier` to a higher value, e.g., 1. Every worker then waits a random time between `0s` and `1s * #workers` before starting,
thereby easing CPU pressure. The default is 0.3.
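As a rough illustration of the staggered startup, the sketch below draws one delay per worker from the interval described above. The function `startup_delay` is hypothetical and only mirrors the documented behavior (uniform between `0` and `multiplier * #workers` seconds); it is not taken from SWE-agent's source.

```python
import random
from typing import Optional


def startup_delay(num_workers: int, multiplier: float = 0.3,
                  rng: Optional[random.Random] = None) -> float:
    """Draw a random startup delay in seconds for one worker."""
    rng = rng or random.Random()
    # The upper bound grows with the number of workers, spreading container
    # startups over a longer window when more workers compete for CPUs.
    return rng.uniform(0.0, multiplier * num_workers)


# With 3 workers and a multiplier of 1, every delay lies between 0s and 3s.
rng = random.Random(0)  # seeded only to make the example reproducible
delays = [startup_delay(num_workers=3, multiplier=1.0, rng=rng) for _ in range(3)]
```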
## Loading instances from a file
```bash
sweagent run-batch \
--config config/default.yaml \
--agent.model.name gpt-4o \
--instances.type file \
--instances.path instances.yaml \
--instances.slice :3 \
--instances.shuffle=True
```
`--instances.path` supports `.jsonl`, `.json`, and `.yaml` files.
Here's the simplest example of what such a file can look like:
```yaml title="instances.yaml"
- image_name: "python:3.11" # (1)!
problem_statement: "A simple test problem"
instance_id: "simple_test_problem"
- image_name: "python:3.11"
problem_statement: "Another test problem"
instance_id: "simple_test_problem_2"
```
1. Must be available locally or on dockerhub.
!!! warning "Recent changes"
The `instance_id` key was named `id` until Mar 16, 2025.
This was changed to add compatibility with the standard SWE-bench format.
However, we temporarily support both names.
!!! tip "More options"
* There are a few more fields that you can populate. See [`SimpleBatchInstance`](../reference/batch_instances.md#sweagent.run.batch_instances.SimpleBatchInstance) for more information.
* For all command line options with this instance type, see [`InstancesFromFile`](../reference/batch_instances.md#sweagent.run.batch_instances.InstancesFromFile).
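If your problems live in another system, you may prefer to generate the instance file with a script. The sketch below writes a `.json` file using the three fields from the YAML example (`image_name`, `problem_statement`, `instance_id`). It assumes that the `.json` variant holds the same list of mappings as the YAML file; the function name `write_instances` is made up for this example.

```python
import json
import tempfile
from pathlib import Path


def write_instances(path: Path, problems: dict, image_name: str = "python:3.11") -> list:
    """Write one instance per (instance_id, problem statement) pair."""
    instances = [
        {
            "image_name": image_name,  # must be available locally or on dockerhub
            "problem_statement": text,
            "instance_id": instance_id,
        }
        for instance_id, text in problems.items()
    ]
    path.write_text(json.dumps(instances, indent=2))
    return instances


out_path = Path(tempfile.mkdtemp()) / "instances.json"
written = write_instances(
    out_path,
    {
        "simple_test_problem": "A simple test problem",
        "simple_test_problem_2": "Another test problem",
    },
)
```

The resulting file can then be passed via `--instances.path`, as in the command at the start of this section.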
## Huggingface instances
If you upload your dataset to Huggingface in a similar style as the example we just saw, you can load it like this:
```bash
sweagent run-batch \
...
--instances.type huggingface \
--instances.dataset_name "your_username/your_dataset" \
--instances.split "dev" \
--instances.slice :3 \
--instances.shuffle=True
```
!!! tip "All instance options"
See [`InstancesFromHuggingFace`](../reference/batch_instances.md#sweagent.run.batch_instances.InstancesFromHuggingFace).
## Expert instances
If this is not enough for your use case, you can also specify a full set of `Environment`, `ProblemStatement`, and `Repository` config objects:
```bash
sweagent run-batch \
...
--instances.type expert_file \
--instances.path instances.yaml
```
where `instances.yaml` could look like this:
```yaml title="instances.yaml"
- env:
    deployment:
      type: docker
      image: python:3.11
    repo:
      type: github
      github_url: "https://github.com/swe-agent/test-repo"
  problem_statement:
    type: text
    text: "A simple test problem"
    id: "simple_test_problem"
- env:
    deployment:
      type: docker
      image: python:3.11
  problem_statement:
    type: text
    text: "A simple test problem 2"
    id: "simple_test_problem_2"
```
!!! tip "All instance options"
See [`ExpertInstances`](../reference/batch_instances.md#sweagent.run.batch_instances.ExpertInstancesFromFile).
## Output files and next steps
All patches generated (all submissions/predictions of the agent) are saved to a `preds.json` file.
If you interrupt `sweagent run-batch`, some of these or the file itself might be missing.
You can use the `sweagent merge-preds` utility to fix this.
The `preds.json` file is very similar to the `.jsonl` format that is used for SWE-bench local runs.
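You can convert between the formats with:
```python
import json
from pathlib import Path

# preds.json maps each instance ID to its prediction fields
preds = json.loads(Path("preds.json").read_text())
# Flatten into one record per instance, as used by the .jsonl format
data = [{"instance_id": key, **value} for key, value in preds.items()]
jsonl = [json.dumps(d) for d in data]
# Join with real newlines so that every record ends up on its own line
Path("all_preds.jsonl").write_text("\n".join(jsonl))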
```
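The opposite direction can be sketched in the same way. The function below reads a `.jsonl` file and builds a dictionary keyed by `instance_id`. It assumes that every line carries an `instance_id` field and keeps that field inside each value as well; whether `preds.json` is expected to repeat the ID there is an assumption. The name `jsonl_to_preds` is made up for this example.

```python
import json
import tempfile
from pathlib import Path


def jsonl_to_preds(jsonl_path: Path, preds_path: Path) -> dict:
    """Convert one-record-per-line predictions into a dict keyed by instance ID."""
    preds = {}
    for line in jsonl_path.read_text().splitlines():
        if not line.strip():
            continue  # tolerate blank lines, e.g. a trailing newline
        record = json.loads(line)
        # Assumption: the ID is kept inside the value in addition to being the key.
        preds[record["instance_id"]] = record
    preds_path.write_text(json.dumps(preds, indent=2))
    return preds


# Small round trip on throwaway files with made-up content.
workdir = Path(tempfile.mkdtemp())
(workdir / "all_preds.jsonl").write_text(
    json.dumps({"instance_id": "a", "model_patch": "diff-a"})
    + "\n"
    + json.dumps({"instance_id": "b", "model_patch": "diff-b"})
    + "\n"
)
restored = jsonl_to_preds(workdir / "all_preds.jsonl", workdir / "preds.json")
```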
!!! tip "Next up"
Take a look at our [competitive runs tutorial](competitive_runs.md)
for more information on running on SWE-bench and similar benchmarks.

View File

@@ -0,0 +1,299 @@
# Command line basics
!!! abstract "Command line basics"
This tutorial walks you through running SWE-agent from the command line.
* Please read our [hello world](hello_world.md) tutorial before proceeding.
* This tutorial focuses on using SWE-agent as a tool to solve individual issues.
Benchmarking SWE-agent is covered [separately](batch_mode.md).
Finally, we have a different tutorial for using SWE-agent for [coding challenges](coding_challenges.md).
!!! tip "Mini-SWE-Agent"
Looking for a simple, no-fuss version of SWE-agent that can also help you in your daily work?
Check out [Mini-SWE-Agent](https://mini-swe-agent.com/)!
## A few examples
Before we start with a more structured explanation of the command line options, here are a few examples that you might find immediately useful:
```bash title="Fix a github issue"
sweagent run \
--agent.model.name=gpt-4o \
--agent.model.per_instance_cost_limit=2.00 \
--env.repo.github_url=https://github.com/SWE-agent/test-repo \
--problem_statement.github_url=https://github.com/SWE-agent/test-repo/issues/1
```
```bash title="Work on a github repo with a custom problem statement" hl_lines="4"
sweagent run \
...
--env.repo.github_url=https://github.com/SWE-agent/test-repo \
--problem_statement.text="Hey, can you fix all the bugs?"
```
```bash title="Fix a bug in a local repository using a custom docker image" hl_lines="4 5 6"
git clone https://github.com/SWE-agent/test-repo.git
sweagent run \
--agent.model.name=claude-sonnet-4-20250514 \
--env.repo.path=test-repo \
--problem_statement.path=test-repo/problem_statements/1.md \
--env.deployment.image=python:3.12
```
1. Make sure to add anthropic keys (or keys for your model provider) to the environment for this one!
2. `--env.deployment.image` points to the [dockerhub image](https://hub.docker.com/_/python) of the same name
For the next example, we will use a cloud-based execution environment instead of using local docker containers.
For this, you first need to set up a Modal account and install the necessary extra dependencies with `pip install 'swe-rex[modal]'`. Then run:
```bash title="Deployment on modal (cloud-based execution)" hl_lines="3"
sweagent run \
...
--env.deployment.type=modal \
--env.deployment.image=python:3.12
```
!!! tip "All options"
Run `sweagent run --help` to see all available options for `run.py`. This tutorial will only cover a subset of options.
## Configuration files
All configuration options can be specified either in one or more `.yaml` files, or as command line arguments. For example, our first command can be written as:
=== "Command line"
```bash
sweagent run --config my_run.yaml
```
=== "Configuration file"
```yaml title="my_run.yaml"
agent:
  model:
    name: gpt-4o
    per_instance_cost_limit: 2.00
env:
  repo:
    github_url: https://github.com/SWE-agent/test-repo
problem_statement:
  github_url: https://github.com/SWE-agent/test-repo/issues/1
```
But we can also split it up into multiple files and additional command line options:
=== "Command line"
```bash
# Note that you need --config in front of every config file
sweagent run --config agent.yaml --config env.yaml \
--problem_statement.text="Hey, can you fix all the bugs?"
```
=== "`agent.yaml`"
```yaml title="agent.yaml"
agent:
  model:
    name: gpt-4o
    per_instance_cost_limit: 2.00
```
=== "`env.yaml`"
```yaml title="env.yaml"
env:
  repo:
    github_url: https://github.com/SWE-agent/test-repo
```
!!! warning "Multiple config files"
Prior to SWE-agent 1.1.0, configs were merged with simple dictionary updates
rather than a hierarchical merge, so specifying `agent` (or any key with subkeys) in the
second config would completely overwrite all `agent` settings of the first config.
This has been fixed since SWE-agent 1.1.0.
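The difference between the two merge strategies is easiest to see on a small example. The sketch below uses plain dictionaries in place of loaded YAML files; `shallow_merge` and `deep_merge` are illustrative helpers written for this page and are not SWE-agent's merge code.

```python
def shallow_merge(first: dict, second: dict) -> dict:
    """Simple dictionary update: top-level keys of `second` replace those of `first`."""
    merged = dict(first)
    merged.update(second)
    return merged


def deep_merge(first: dict, second: dict) -> dict:
    """Hierarchical merge: nested dictionaries are combined key by key."""
    merged = dict(first)
    for key, value in second.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


first_config = {"agent": {"model": {"name": "gpt-4o", "per_instance_cost_limit": 2.00}}}
second_config = {"agent": {"model": {"name": "claude-sonnet-4-20250514"}}}

# Simple update: the whole `agent` subtree is replaced, so the cost limit is lost.
lost = shallow_merge(first_config, second_config)
# Hierarchical merge: only the model name changes, the cost limit survives.
kept = deep_merge(first_config, second_config)
```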
The default config file is `config/default.yaml`. Let's take a look at it:
<details>
<summary>Example: default config <code>default.yaml</code></summary>
```yaml
--8<-- "config/default.yaml"
```
</details>
As you can see, this is where all the templates are defined!
This file is also loaded when no other `--config` options are specified.
So to make sure that we get the default templates in the above examples with `--config`, we should have added
```bash
--config config/default.yaml
```
in addition to all the other `--config` options for the two examples above.
## Problem statements and union types <a id="union-types"></a>
!!! note "Operating in batch mode: Running on SWE-bench and other benchmark sets"
If you want to run SWE-agent in batch mode on SWE-bench or another whole evaluation set, see
[batch mode](batch_mode.md). This tutorial focuses on using SWE-agent on
individual issues.
We've already seen a few examples of how to specify the problem to solve, namely
```bash
--problem_statement.github_url=https://github.com/SWE-agent/test-repo/issues/1
--problem_statement.path=test-repo/problem_statements/1.md
--problem_statement.text="..."
```
Each of these types of problems can have specific configuration options.
To understand how this works, we'll need to understand **union types**.
Running `sweagent run` builds up a configuration object that essentially looks like this:
```yaml
agent: AgentConfig
env: EnvironmentConfig
problem_statement: TextProblemStatement | GithubIssue | FileProblemStatement # (1)!
```
1. This is a union type, meaning that the problem statement can be one of the three types.
Each of these configuration objects has its own set of options:
* [`GithubIssue`](../reference/problem_statements.md#sweagent.agent.problem_statement.GithubIssue)
* [`TextProblemStatement`](../reference/problem_statements.md#sweagent.agent.problem_statement.TextProblemStatement)
* [`FileProblemStatement`](../reference/problem_statements.md#sweagent.agent.problem_statement.FileProblemStatement)
So how do we know which configuration object to initialize?
It's simple: Each of these types has a different set of required options (e.g., `github_url` is required for `GithubIssue`, but not for `TextProblemStatement`).
SWE-agent will automatically select the correct configuration object based on the command line options you provide.
However, you can also explicitly specify the type of problem statement you want to use by adding a `--problem_statement.type` option.
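The selection logic can be sketched without pydantic: try each candidate type in turn and keep the first one that accepts the given options. The dataclasses below are simplified stand-ins with made-up names (`TextProblem`, `GithubIssueProblem`, `FileProblem`); the real configuration classes have more fields and are validated by pydantic.

```python
from dataclasses import dataclass


@dataclass
class TextProblem:
    text: str


@dataclass
class GithubIssueProblem:
    github_url: str


@dataclass
class FileProblem:
    path: str


CANDIDATES = (TextProblem, GithubIssueProblem, FileProblem)


def build_problem_statement(options: dict):
    """Return the first candidate type that can be built from `options`."""
    errors = []
    for candidate in CANDIDATES:
        try:
            return candidate(**options)
        except TypeError as exc:
            # Missing or unexpected options: remember why this type did not fit.
            errors.append(f"{candidate.__name__}: {exc}")
    # Nothing matched, so report the reason for every type at once.
    raise ValueError("no problem statement type matched:\n" + "\n".join(errors))


from_text = build_problem_statement({"text": "Hey, can you fix all the bugs?"})
from_issue = build_problem_statement(
    {"github_url": "https://github.com/SWE-agent/test-repo/issues/1"}
)
```

Passing options that belong to two different types (for example both `path` and `github_url`) makes every candidate fail, and the resulting error lists one reason per type. That is why union type errors tend to be long.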
!!! tip "Union type errors"
If you ever ran a SWE-agent command and got a very long error message about various configuration options not working, it is because of union types.
If everything works correctly, we try to initialize every option until we find the one that works based on your inputs (for example stopping at `TextProblemStatement` if you provided a `--problem_statement.text`).
However, if none of them work, we throw an error which then tells you why we cannot initialize any of the types (so it will tell you that `github_url` is required for `GithubIssue`, even though you might not even have tried to work on a GitHub issue).
<details>
<summary>Example union type errors</summary>
This is the output of running
```bash
sweagent run --problem_statement.path="test" --problem_statement.github_url="asdf"
```
```
--8<-- "docs/usage/union_type_error.txt"
```
</details>
If you want to read more about how this works, check out the [pydantic docs](https://docs.pydantic.dev/latest/concepts/unions/).
## Specifying the repository
The repository can be specified in a few different ways:
```bash
--env.repo.github_url=https://github.com/SWE-agent/test-repo
--env.repo.path=/path/to/repo
```
Again, those are [union types](#union-types). See here for all the options:
* [`GithubRepoConfig`](../reference/repo.md#sweagent.environment.repo.GithubRepoConfig): Pull a repository from GitHub.
* [`LocalRepoConfig`](../reference/repo.md#sweagent.environment.repo.LocalRepoConfig): Copies a repository from your local filesystem to the docker container.
* [`PreExistingRepoConfig`](../reference/repo.md#sweagent.environment.repo.PreExistingRepoConfig): If you want to use a repository that already exists on the docker container.
## Configuring the environment
We recommend building a docker image with all the dependencies you need and then using it with `--env.deployment.image`.
You can also execute additional commands before starting the agent with `--env.post_startup_commands`, which takes a list of commands, e.g.,
```bash
sweagent run \
--agent.model.name=claude-3-7-sonnet-latest \
--env.post_startup_commands='["pip install flake8"]' \
...
```
Note the list syntax, which is passed as a string wrapped in single quotes (`'`). This is particularly important for `zsh`, where `[` and `]` have special meaning.
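If you assemble the command from a script, the quoting can be generated rather than typed by hand. The sketch below builds the flag with the standard library, assuming that the value uses JSON list syntax as in the example above; the helper `post_startup_flag` is made up for this illustration.

```python
import json
import shlex


def post_startup_flag(commands: list) -> str:
    """Build a shell-safe --env.post_startup_commands argument."""
    # json.dumps produces the bracketed, double-quoted list syntax.
    value = json.dumps(commands)
    # shlex.quote wraps the value so that the shell passes it through unchanged.
    return "--env.post_startup_commands=" + shlex.quote(value)


flag = post_startup_flag(["pip install flake8"])
```

For the single command used here, `flag` is the same argument as in the snippet above, single quotes included.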
Here's an example of a custom docker environment (it's also available in the repo as `docker/tiny_test.Dockerfile`):
<!-- There's a dockerfile annotation, but it somehow breaks annotations -->
```bash title="tiny_test.Dockerfile"
FROM python:3.11.10-bullseye # (1)!
ARG DEBIAN_FRONTEND=noninteractive # (2)!
ENV TZ=Etc/UTC # (3)!
WORKDIR /
# SWE-ReX will always attempt to install its server into your docker container
# however, this takes a couple of seconds. If we already provide it in the image,
# this is much faster.
RUN pip install pipx
RUN pipx install swe-rex # (4)!
RUN pipx ensurepath # (5)!
RUN pip install flake8 # (6)!
SHELL ["/bin/bash", "-c"]
# This is where pipx installs things
ENV PATH="$PATH:/root/.local/bin/" # (7)!
```
Click on the :material-chevron-right-circle: icon in the right margin of the code snippet to see more information about the lines.
1. This is the base image.
2. This is to avoid any interactive prompts from the package manager.
3. Again, this avoids interactive prompts.
4. SWE-ReX is our execution backend. We start a small server within the container, which receives
commands from the agent and executes them.
5. This ensures that the path where pipx installs things is in the `$PATH` variable.
6. This is to install flake8, which is used by some of our edit tools.
7. Unfortunately, step 5 sometimes still doesn't properly add the SWE-ReX server to the `$PATH` variable.
So we do it here again.
## Taking actions
* You can use `--actions.apply_patch_locally` to have SWE-agent apply successful solution attempts to local files.
* Alternatively, when running on a GitHub issue, you can have the agent automatically open a PR if the issue has been solved by supplying the `--actions.open_pr` flag.
Please use this feature responsibly (on your own repositories or after careful consideration).
!!! tip "All action options"
See [`RunSingleActionConfig`](../reference/run_single_config.md#sweagent.run.run_single.RunSingleActionConfig) for all action options.
Alternatively, you can always retrieve the patch that was generated by SWE-agent.
Watch out for the following message in the log:
```
╭──────────────────────────── 🎉 Submission successful 🎉 ────────────────────────────╮
│ SWE-agent has produced a patch that it believes will solve the issue you submitted! │
│ Use the code snippet below to inspect or apply it! │
╰─────────────────────────────────────────────────────────────────────────────────────╯
```
And follow the instructions below it:
```bash
# The patch has been saved to your local filesystem at:
PATCH_FILE_PATH='/Users/.../patches/05917d.patch'
# Inspect it:
cat "${PATCH_FILE_PATH}"
# Apply it to a local repository:
cd <your local repo root>
git apply "${PATCH_FILE_PATH}"
```
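If you would rather script this step, the sketch below wraps the same `git apply` call in Python and runs `git apply --check` first, so a patch that no longer applies cleanly is reported without touching the working tree. The function names `git_apply_command` and `apply_patch` are hypothetical helpers for illustration and are not part of SWE-agent.

```python
import subprocess
from pathlib import Path


def git_apply_command(patch_file: Path, check_only: bool = False) -> list[str]:
    """Build the `git apply` command line for a patch file."""
    cmd = ["git", "apply"]
    if check_only:
        # --check only verifies that the patch applies; it changes nothing
        cmd.append("--check")
    cmd.append(str(patch_file))
    return cmd


def apply_patch(repo_root: Path, patch_file: Path) -> bool:
    """Apply the patch inside repo_root; return False if it does not apply cleanly."""
    dry_run = subprocess.run(git_apply_command(patch_file, check_only=True), cwd=repo_root)
    if dry_run.returncode != 0:
        return False
    subprocess.run(git_apply_command(patch_file), cwd=repo_root, check=True)
    return True
```

Call `apply_patch(Path("<your local repo root>"), Path(patch_file_path))` with the path printed in the log message.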
{% include-markdown "../_footer.md" %}


@@ -0,0 +1,23 @@
# SWE-agent command line interface
All functionality of SWE-agent is available on the command line through the `sweagent` command.
You can run `sweagent --help` to see all subcommands.
## Running SWE-agent
* `sweagent run`: Run SWE-agent on a single issue ([tutorial](hello_world.md)).
* `sweagent run-batch`: Run SWE-agent on a batch of issues ([tutorial](batch_mode.md)).
* `sweagent run-replay`: Replay a trajectory file or a demo file. This means that you take all actions from the trajectory and execute them again in the environment. Useful for debugging your [tools](../config/tools.md) or for building new [demonstrations](../config/demonstrations.md).
## Inspecting runs
* `sweagent inspect` or `sweagent i`: Open the command line inspector ([more information](inspector.md)).
* `sweagent inspector` or `sweagent I`: Open the web-based inspector ([more information](inspector.md)).
* `sweagent quick-stats` or `sweagent qs`: When executed in a directory with trajectories, displays a summary of `exit_status` and more.
## Advanced scripts
* `sweagent merge-preds`: Merge multiple prediction files into a single file.
* `sweagent traj-to-demo`: Convert a trajectory file to an easy-to-edit demo file ([more information on demonstrations](../config/demonstrations.md)).
* `sweagent remove-unfinished`: Remove unfinished trajectories.


@@ -0,0 +1,157 @@
# Using SWE-agent for coding challenges
!!! abstract "Overview"
It is easy to use SWE-agent to do more than just software engineering.
For example, you can tell SWE-agent to work on leetcode or humaneval-style problems.
Required reading:
Please first skim [hello world](hello_world.md) and [command line basics](cl_tutorial.md) to get familiar with the basics of SWE-agent.
Let's start with a new problem statement. For this, put the problem you want to solve in a markdown file `problem.md`, for example:
<details>
<summary>Example leetcode challenge</summary>
This is the <a href="https://leetcode.com/problems/first-missing-positive/">first missing positive</a> challenge.
```markdown
--8<-- "docs/usage/leetcode_example.md"
```
</details>
Second, we need to specify a repository wherein SWE-agent will work.
Here, we can simply create an empty folder (outside of the SWE-agent repository), and add a `main.py` file
```bash
mkdir empty
cd empty  # run the remaining commands inside the new folder
git init
touch main.py
echo "*.pyc" > .gitignore # to avoid binary files in patches
```
and potentially populate it with the problem stub
```python
from typing import List
class Solution:
def firstMissingPositive(self, nums: List[int]) -> int:
```
!!! tip
If some imports (like `List`) are missing from the problem stub (as they often are
on leetcode), SWE-agent will figure out how to add them. However, this might take an
additional step, so it's best to specify them directly.
Make sure to commit all changes to the repository:
```bash
git add . && git commit -m "Add problem stub"
```
Now, we can let SWE-agent solve the problem:
```bash
sweagent run \
--config config/coding_challenge.yaml \
--problem_statement.path=problem.md \
--env.repo.path=/path/to/empty \
--agent.model.name=gpt-4o \
--agent.model.per_instance_cost_limit=3.0 \
--actions.apply_patch_locally=True
```
<details>
<summary>Output</summary>
```
--8<-- "docs/usage/coding_challenge_output.log"
```
</details>
SWE-agent will typically conclude with a message like
```
INFO Trajectory saved to trajectories/fuchur/azure-gpt4__problem__coding_challenge__t-0.00__p-0.95__c-3.00__install-1/26d111.traj
INFO Saved predictions to
trajectories/fuchur/azure-gpt4__problem__coding_challenge__t-0.00__p-0.95__c-3.00__install-1/all_preds.jsonl
╭──────────────────────────── 🎉 Submission successful 🎉 ────────────────────────────╮
│ SWE-agent has produced a patch that it believes will solve the issue you submitted! │
│ Use the code snippet below to inspect or apply it! │
╰─────────────────────────────────────────────────────────────────────────────────────╯
# The patch has been saved to your local filesystem at:
PATCH_FILE_PATH='/Users/fuchur/Documents/24/git_sync/SWE-agent/trajectories/fuchur/azure-gpt4__problem__coding_challenge__t-0.00__p-0
5__c-3.00__install-1/patches/26d111.patch'
# Inspect it:
cat "${PATCH_FILE_PATH}"
# Apply it to a local repository:
cd <your local repo root>
git apply "${PATCH_FILE_PATH}"
```
This informs you of the location of the [trajectory](trajectories.md) and of the patch file that contains the solution.
In our case, this looks like this:
<details>
<summary>Patch (solution)</summary>
```patch
diff --git a/main.py b/main.py
index 1e7f08f..beabaa7 100644
--- a/main.py
+++ b/main.py
@@ -2,4 +2,37 @@ from typing import List
class Solution:
- def firstMissingPositive(self, nums: List[int]) -> int:
+ # Implementation of the function to find the smallest missing positive integer
+ def firstMissingPositive(self, nums: List[int]) -> int:
+ n = len(nums)
+ # Mark numbers (num < 0) and (num > n) with a special marker number (n+1)
+ # We can ignore those because if missing number is in the range (1, n+1)
+ # it must be in the array
+ for i in range(n):
+ if nums[i] <= 0 or nums[i] > n:
+ nums[i] = n + 1
+
+ # Use index as a hash key and number sign as a presence detector
+ # For example, if nums[1] is negative that means that number `1` is present in the array
+ for i in range(n):
+ num = abs(nums[i])
+ if num <= n:
+ nums[num - 1] = -abs(nums[num - 1])
+
+ # Now the index of the first positive number is equal to the smallest missing positive number
+ for i in range(n):
+ if nums[i] > 0:
+ return i + 1
+
+ # If no positive numbers were found, the smallest missing positive is n+1
+ return n + 1
+# Adding test cases to verify the implementation of firstMissingPositive
+if __name__ == "__main__":
+ sol = Solution()
+ # Test case 1
+ assert sol.firstMissingPositive([1, 2, 0]) == 3, "Test case 1 failed"
+ # Test case 2
+ assert sol.firstMissingPositive([3, 4, -1, 1]) == 2, "Test case 2 failed"
+ # Test case 3
+ assert sol.firstMissingPositive([7, 8, 9, 11, 12]) == 1, "Test case 3 failed"
+ print("All test cases passed successfully.")
```
</details>
Because of the `--actions.apply_patch_locally` flag, the patch has also already been applied to the repository,
so you can also retrieve the final solution from there.
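To double-check the solution beyond the three assertions in the patch, you could run it against a few more inputs. The sketch below restates the algorithm from the patch as a standalone function and compares it with a simple set-based reference. The names `first_missing_positive` and `reference`, as well as the extra inputs, are chosen here for illustration.

```python
from typing import List


def first_missing_positive(nums: List[int]) -> int:
    """Sign-marking algorithm from the patch above, operating on a copy."""
    nums = list(nums)  # copy so that the caller's list is not modified
    n = len(nums)
    # Values outside 1..n cannot be the answer; replace them with n + 1
    for i in range(n):
        if nums[i] <= 0 or nums[i] > n:
            nums[i] = n + 1
    # Mark value v as present by making the entry at index v - 1 negative
    for i in range(n):
        num = abs(nums[i])
        if num <= n:
            nums[num - 1] = -abs(nums[num - 1])
    # The first index that is still positive corresponds to the missing value
    for i in range(n):
        if nums[i] > 0:
            return i + 1
    return n + 1


def reference(nums: List[int]) -> int:
    """Straightforward reference: count up from 1 until a value is missing."""
    seen = set(nums)
    candidate = 1
    while candidate in seen:
        candidate += 1
    return candidate


for case in ([1, 2, 0], [3, 4, -1, 1], [7, 8, 9, 11, 12], [], [1], [2, 2], [1, 1, 2]):
    assert first_missing_positive(case) == reference(case), case
```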
### Improving SWE-agent for coding challenges
By default, the demonstration trajectory the agent uses while solving a coding challenge is one in which it fixes a small bug in a short piece of code (from the HumanEvalFix dataset). Since that process is not very similar to solving a coding challenge, performance would probably improve substantially if the agent were given a demonstration trajectory in which it solves an actual programming challenge. To learn how to do that, see the [demonstrations documentation](../config/demonstrations.md).


@@ -0,0 +1,87 @@
# Competitive runs
!!! abstract "Running swe-agent competitively on benchmarks"
This page contains information on our competitive runs on SWE-bench, as well as tips and tricks for evaluating on large batches.
* Please make sure you're familiar with [the command line basics](cl_tutorial.md) and the [batch mode](batch_mode.md)
* The default examples will be executing code in a Docker sandbox, so make sure you have docker installed ([docker troubleshooting](../installation/tips.md)).
## Current competitive configurations
!!! hint "Most recent configs"
You can find all benchmark submission configs [here](https://github.com/SWE-agent/SWE-agent/tree/main/config/benchmarks)
Examples of configurations for SWE-bench submissions:
* [250225_anthropic_filemap_simple_review.yaml](https://github.com/SWE-agent/SWE-agent/blob/main/config/250225_anthropic_filemap_simple_review.yaml):
This is our current default one-attempt config. It uses `claude-3-7-sonnet-20250219`.
* [250212_sweagent_heavy_sbl.yaml](https://github.com/SWE-agent/SWE-agent/blob/main/config/250212_sweagent_heavy_sbl.yaml):
This config runs 5 attempts with slightly different configurations using `claude-3-7-sonnet-latest`,
then uses o1 to discriminate between them.
This is a very expensive configuration.
If you use it, also make sure to use Claude 3.7 instead of Claude 3.5.
!!! warning "Retry configurations and command line arguments"
Note that the structure of the configuration with agents that run multiple attempts is different from the one of the
default agent. In particular, supplying options like `--agent.model.name` etc. will cause (potentially confusing)
error messages. Take a look at the above configuration file to see the structure!
You can find the command with which to run each config at the top of the config file.
In order to run on multiple workers with Claude, you need to use multiple API keys in order to have enough cache break points.
For this, please set the following environment variable before running
```bash
# concatenate your keys
export CLAUDE_API_KEY_ROTATION="KEY1:::KEY2:::KEY3"
```
See our [notes on Claude](../config/models.md) for more details.
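The variable holds all keys in a single string, separated by `:::`. As a rough illustration of that format only (this is not SWE-agent's own parsing code, and the helper names are made up), the value can be built from a list of keys and split back like this:

```python
SEPARATOR = ":::"


def join_keys(keys: list[str]) -> str:
    """Build a value for CLAUDE_API_KEY_ROTATION from individual keys."""
    for key in keys:
        if not key or SEPARATOR in key:
            raise ValueError("keys must be non-empty and must not contain ':::'")
    return SEPARATOR.join(keys)


def split_keys(value: str) -> list[str]:
    """Split a ':::'-separated value back into individual keys."""
    return [key for key in value.split(SEPARATOR) if key]


rotation = join_keys(["KEY1", "KEY2", "KEY3"])
assert rotation == "KEY1:::KEY2:::KEY3"
assert split_keys(rotation) == ["KEY1", "KEY2", "KEY3"]
```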
## Memory consumption
We run our configuration on a machine with 32GB memory and 8 cores.
To avoid out-of-memory (OOM) situations, we recommend setting
```bash
--instances.deployment.docker_args=--memory=10g
```
limiting the maximum amount of memory per worker.
In our case, this completely avoided any instances of running OOM.
However, OOM situations can potentially lock you out of the server, so
you might want to use a script like the following as a second layer
of defense to kill any process that hogs too much memory (note that this will affect _any_ script and not just swe-agent):
<details>
<summary>Memory sentinel</summary>
```python
--8<-- "docs/usage/memory_sentinel.py"
```
</details>
If swe-agent dies or you frequently abort it, you might have leftover docker containers
(they are cleaned up by normal termination of swe-agent but can be left over if it is killed).
You can use a sentinel script like the following to clean them up periodically
(note that this will affect _any_ long running container and not just those from swe-agent):
<details>
<summary>Container sentinel</summary>
```bash
--8<-- "docs/usage/containers_sentinel.sh"
```
</details>
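The core of such a sentinel is deciding from a container's reported age whether it has been running for too long. Here is a minimal Python sketch of that decision, assuming that docker reports ages as strings such as `3 hours ago` or `2 days ago`. The function name `is_long_running`, the 2-hour default, and the unit table are illustrative assumptions.

```python
import re

# Approximate length of each unit in hours (assumed values, for illustration only)
_UNIT_HOURS = {"hour": 1, "day": 24, "week": 24 * 7, "month": 24 * 30, "year": 24 * 365}
_AGE_PATTERN = re.compile(r"(\d+) (hour|day|week|month|year)s?\b")


def is_long_running(running_for: str, max_hours: int = 2) -> bool:
    """Return True if an age string like '3 hours ago' means at least max_hours."""
    match = _AGE_PATTERN.search(running_for)
    if match is None:
        # Seconds, minutes, or phrases without a number ("About an hour ago")
        return False
    count, unit = int(match.group(1)), match.group(2)
    return count * _UNIT_HOURS[unit] >= max_hours


assert is_long_running("3 hours ago")
assert is_long_running("2 days ago")
assert not is_long_running("45 minutes ago")
assert not is_long_running("About an hour ago")
```

Treating every age without a number as "short" keeps freshly started containers safe, at the price of not catching a container that is just over one hour old when a lower threshold is used.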
## Tradeoffs between resolution rate and cost
* Running multi-attempt configurations will always be _very_ expensive. Don't use them if cost matters.
* The simplest way to keep cost in check is to set a per-instance cost limit or a turn limit.
Without any limit, the cost of a run is unbounded, because an agent that never submits will keep iterating.
With Claude 3.7, a cost-conservative choice is a per-instance cost limit of $1 or lower and a turn limit of 50.
For our SWE-bench submissions, we use slightly higher limits (see the configs above).
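Because the per-instance limit caps what a single instance can cost, it also gives a simple upper bound for a whole single-attempt batch. A back-of-the-envelope sketch follows; the function name and the batch size of 500 instances are illustrative and not taken from SWE-agent.

```python
def worst_case_batch_cost(n_instances: int, per_instance_cost_limit: float) -> float:
    """Upper bound on spend if every instance runs into its cost limit."""
    if n_instances < 0 or per_instance_cost_limit < 0:
        raise ValueError("both arguments must be non-negative")
    return n_instances * per_instance_cost_limit


# Hypothetical batch: 500 instances with a $1 per-instance limit
print(worst_case_batch_cost(500, 1.0))  # 500.0
```

The actual spend is usually lower, since instances that submit early never reach the limit.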


@@ -0,0 +1,22 @@
#!/bin/bash
while true; do
echo "Checking for long-running containers..."
# List all running containers with their uptime
docker ps --format "{{.ID}} {{.RunningFor}}" | while read -r id running_for; do
# Extract the number and unit from the running time
if [[ $running_for =~ ([0-9]+)\ (hour|hours) ]]; then
hours=${BASH_REMATCH[1]}
if (( hours >= 2 )); then
echo "Killing container $id (running for $running_for)..."
docker kill "$id"
fi
elif [[ $running_for =~ ([0-9]+)\ (day|week|month|year)s? ]]; then
# Anything reported in days or larger units (weeks, months, years) is definitely over 2 hours
echo "Killing container $id (running for $running_for)..."
docker kill "$id"
fi
done
echo "Sleeping for 10 minutes..."
sleep 600 # Wait 600 seconds (10 minutes) before running again
done


@@ -0,0 +1,86 @@
# Hello world
!!! abstract "Fix a GitHub issue with SWE-agent"
In this tutorial, we will fix a GitHub issue with SWE-agent using the command line interface.
* Make sure you have [installed](../installation/index.md) SWE-agent and have a [language model](../installation/keys.md) set up.
* We will be executing code in a Docker sandbox, so make sure you have docker installed ([docker troubleshooting](../installation/tips.md)).
If you cannot run docker, skim this tutorial and see how you can run SWE-agent with cloud-based execution in the [command line basics tutorial](cl_tutorial.md).
!!! tip "Mini-SWE-Agent"
Looking for a simple, no-fuss version of SWE-agent that can also help you in your daily work?
Check out [Mini-SWE-Agent](https://mini-swe-agent.com/)!
After installing SWE-agent, you have the `sweagent` command available. Run `sweagent --help` to see the list of subcommands.
The most important ones are
* `sweagent run`: Run SWE-agent on a single problem statement. This is covered on this page and for slightly more advanced examples in the [command line basics tutorial](cl_tutorial.md).
* `sweagent run-batch`: Run SWE-agent on a list of problem statements. This is what you would use for benchmarking, or when
working with a larger set of historic issues. Covered in the [batch mode tutorial](batch_mode.md).
In this tutorial, we will focus on the `run` subcommand.
Let's start with an absolutely trivial example and solve an issue about a simple syntax error ([`swe-agent/test-repo #1`](https://github.com/SWE-agent/test-repo/issues/1)):
```bash
sweagent run \
--agent.model.name=claude-sonnet-4-20250514 \
--agent.model.per_instance_cost_limit=2.00 \
--env.repo.github_url=https://github.com/SWE-agent/test-repo \
--problem_statement.github_url=https://github.com/SWE-agent/test-repo/issues/1
```
The example above uses the `Claude Sonnet 4` model from Anthropic. Alternatively, you can use `GPT-4o` (from OpenAI), for example,
by setting `--agent.model.name=gpt-4o`.
To use either model, you need to add the corresponding API key to the environment:
```bash
export ANTHROPIC_API_KEY=<your key>
export OPENAI_API_KEY=<your key>
```
Alternatively, you can create a `.env` file in your working directory and put your keys in there like so:
```bash
ANTHROPIC_API_KEY=<your key>
OPENAI_API_KEY=<your key>
```
We should support all models that you can think of.
!!! tip "Models and keys"
Read more about configuring [models and API keys](../installation/keys.md).
<details>
<summary>Output</summary>
```
--8<-- "docs/usage/hello_world_output.txt"
```
</details>
As you can see, the command line options are hierarchical. At the top level, there are three important sections:
* `problem_statement`: What problem are you trying to solve?
* `agent`: How do you want to solve the problem? This includes setting up the LM with `--agent.model`.
* `env`: What is the environment in which the problem statement should be solved?
This includes setting the repository/folder with the source files with `--env.repo`, as well as docker images and other dependencies.
This will also control where the code is executed (in a local container or in the cloud).
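To make the hierarchy concrete, the sketch below turns dotted options such as `--agent.model.name=...` into a nested dictionary. It only illustrates how the dotted names map onto sections; it is not how SWE-agent parses its arguments, and `nest_options` is a made-up helper.

```python
from typing import Any


def nest_options(args: list[str]) -> dict[str, Any]:
    """Turn ['--a.b.c=value', ...] into {'a': {'b': {'c': 'value'}}}."""
    config: dict[str, Any] = {}
    for arg in args:
        if not arg.startswith("--") or "=" not in arg:
            raise ValueError(f"expected --key.path=value, got {arg!r}")
        dotted_key, value = arg[2:].split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = config
        for part in parents:
            # Descend into the section, creating it on first use
            node = node.setdefault(part, {})
        node[leaf] = value  # values are kept as plain strings in this sketch
    return config


options = nest_options([
    "--agent.model.name=claude-sonnet-4-20250514",
    "--agent.model.per_instance_cost_limit=2.00",
    "--env.repo.github_url=https://github.com/SWE-agent/test-repo",
    "--problem_statement.github_url=https://github.com/SWE-agent/test-repo/issues/1",
])
assert sorted(options) == ["agent", "env", "problem_statement"]
assert options["agent"]["model"]["name"] == "claude-sonnet-4-20250514"
```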
Watching the output, you can notice several stages:
1. Setting up the **deployment**: SWE-agent lets LMs execute actions in sandboxed environments. It can run these environments
in Docker containers (default), on Modal, on AWS Fargate, or directly on your computer (not recommended).
When the deployment starts, you will notice a "starting runtime" message that takes a few seconds. The runtime is
what is executing the commands within your deployment.
Deployments are managed by a package called [`SWE-ReX`](https://swe-rex.com/latest/).
2. Setting up [**tools**](../config/tools.md): The tools that you specified are copied and installed within the environment.
3. **System and instance prompts**: The initial instructions are shown to the LM. They are fully [configurable](../config/templates.md).
4. **Main loop**: The LM starts to suggest and execute actions.
5. **Submission**: The LM calls `submit` and we extract the patch (i.e., the changes to the source code that solve the problem).
The complete details of the run are saved as a ["trajectory" file](trajectories.md). They can also be turned into new [demonstrations](../config/demonstrations.md) together with other log and output files.
Whetted your appetite? Head over to the [command line basics tutorial](cl_tutorial.md) to learn more about the options.


@@ -0,0 +1,667 @@
👋 INFO This is SWE-agent version 1.0.1 (hash='5206f8371274adb34f4946f8bee92f14615a91e1') with SWE-ReX version 1.2.1 (rex_hash='b30056ae7add48ccddef038f34e2a049889eff64').
/opt/miniconda3/envs/swea13/lib/python3.13/site-packages/pydantic/_internal/_config.py:341: UserWarning: Valid config keys have changed in V2:
* 'fields' has been removed
warnings.warn(message, UserWarning)
🔧 INFO Loading default config from /Users/fuchur/Documents/24/git_sync/SWE-agent/config/anthropic_filemap.yaml, because no other config file is specified. Specify --no_config_file to disable this.
🔧 INFO Setting problem statement based on github issue url
🔧 INFO Loaded environment variables from /Users/fuchur/Documents/24/git_sync/SWE-agent/.env
🤖 WARN Claude 3.7 models do not support 128k context by default. Setting max output tokens to 64k. To enable 128k context, please set the completion_kwargs to {'extra_headers': {'anthropic-beta':
'output-128k-2025-02-19'}}.
🏃 INFO Starting environment
🦖 INFO Building image python:3.11 to install a standalone python to /root. This might take a while (but you only have to do it once). To skip this step, set `python_standalone_dir` to None.
🦖 DEBUG Found free port 54389
🦖 INFO Starting container python3.11-ab0dbfce-00dd-4b4c-bbdc-aa16cbc3590f with image python:3.11 serving on port 54389
🦖 DEBUG Command: "docker run --rm -p 54389:8000 --name python3.11-ab0dbfce-00dd-4b4c-bbdc-aa16cbc3590f sha256:9e1c9f1444e6352f8447e2acf9bb8ed31c3844ec059a517d3a8f05584276c535 /bin/sh -c
'/root/python3.11/bin/swerex-remote --auth-token 011937af-978b-49fa-9808-48408adcb0a2'"
🦖 INFO Starting runtime at 54389
🦖 INFO Runtime started in 1.07s
🪴 INFO Environment Initialized
🪴 DEBUG Resetting repository SWE-agent__test-repo to commit HEAD
🏃 INFO Running agent
🤠 INFO Setting up agent for instance SWE-agent__test-repo-i1
🤠 INFO Trajectory will be saved to
/Users/fuchur/Documents/24/git_sync/SWE-agent/trajectories/fuchur/anthropic_filemap__claude-3-7-sonnet-latest__t-0.00__p-1.00__c-2.00___SWE-agent__test-repo-i1/SWE-agent__test-repo-i1/SWE-agent__test-re
po-i1.traj
🪴 DEBUG No environment variables to set
🦖 DEBUG Uploading file from /Users/fuchur/Documents/24/git_sync/SWE-agent/tools/registry to /root/tools/registry
🦖 DEBUG Created zip file at /var/folders/g3/1y2_hl1j16scls01cxqm76s80000gn/T/tmp_37eron8/zipped_transfer.zip
🦖 DEBUG Uploading file from /Users/fuchur/Documents/24/git_sync/SWE-agent/tools/edit_anthropic to /root/tools/edit_anthropic
🦖 DEBUG Created zip file at /var/folders/g3/1y2_hl1j16scls01cxqm76s80000gn/T/tmpbtgk4i0v/zipped_transfer.zip
🦖 DEBUG Uploading file from /Users/fuchur/Documents/24/git_sync/SWE-agent/tools/review_on_submit_m to /root/tools/review_on_submit_m
🦖 DEBUG Created zip file at /var/folders/g3/1y2_hl1j16scls01cxqm76s80000gn/T/tmpud9n_fx7/zipped_transfer.zip
🧰 INFO Resetting tools
🪴 DEBUG No environment variables to set
🤠 INFO SYSTEM (main)
You are a helpful assistant that can interact with a computer to solve tasks.
🧰 DEBUG Retrieved state from environment: {'working_dir': '/SWE-agent__test-repo'}
🤠 INFO 🤖 MODEL INPUT
<uploaded_files>
/SWE-agent__test-repo
</uploaded_files>
I've uploaded a python code repository in the directory /SWE-agent__test-repo. Consider the following PR description:
<pr_description>
SyntaxError: invalid syntax
I'm running `missing_colon.py` as follows:
```python
division(23, 0)
```
but I get the following error:
```
File "/Users/fuchur/Documents/24/git_sync/swe-agent-test-repo/tests/./missing_colon.py", line 4
def division(a: float, b: float) -> float
^
SyntaxError: invalid syntax
```
</pr_description>
Can you help me implement the necessary changes to the repository so that the requirements specified in the <pr_description> are met?
I've already taken care of all changes to any of the test files described in the <pr_description>. This means you DON'T have to modify the testing logic or any of the tests in any way!
Your task is to make the minimal changes to non-tests files in the /SWE-agent__test-repo directory to ensure the <pr_description> is satisfied.
Follow these steps to resolve the issue:
1. As a first step, it might be a good idea to find and read code relevant to the <pr_description>
2. Create a script to reproduce the error and execute it with `python <filename.py>` using the bash tool, to confirm the error
3. Edit the sourcecode of the repo to resolve the issue
4. Rerun your reproduce script and confirm that the error is fixed!
5. Think about edgecases and make sure your fix handles them as well
Your thinking should be thorough and so it's fine if it's very long.
🤠 INFO ========================= STEP 1 =========================
🤖 DEBUG n_cache_control: 1
🤖 INFO Response: ModelResponse(id='chatcmpl-e4e139ba-eeb0-42cc-a764-ce3dd3cfd953', created=1746538400, model='claude-3-7-sonnet-latest', object='chat.completion', system_fingerprint=None,
choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content="I'll help you implement the necessary changes to fix the SyntaxError in the repository. Let's follow the steps you
outlined.\n\n## Step 1: Find and read code relevant to the PR description\n\nFirst, let's explore the repository structure to locate the `missing_colon.py` file mentioned in the PR description.",
role='assistant', tool_calls=[ChatCompletionMessageToolCall(index=1, function=Function(arguments='{"command": "find /SWE-agent__test-repo -name \\"missing_colon.py\\""}', name='bash'),
id='toolu_01YUryV1Xd2yqWF3Gd3ka2ct', type='function')], function_call=None))], usage=Usage(completion_tokens=141, prompt_tokens=1553, total_tokens=1694, completion_tokens_details=None,
prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=0, text_tokens=None, image_tokens=None), cache_creation_input_tokens=1549, cache_read_input_tokens=0))
🤖 DEBUG input_tokens=392, output_tokens=64, instance_cost=0.01, cost=0.01
🤖 DEBUG total_tokens_sent=392, total_tokens_received=64, total_cost=0.01, total_api_calls=1
🤠 INFO 💭 THOUGHT
I'll help you implement the necessary changes to fix the SyntaxError in the repository. Let's follow the steps you outlined.
## Step 1: Find and read code relevant to the PR description
First, let's explore the repository structure to locate the `missing_colon.py` file mentioned in the PR description.
🎬 ACTION
find /SWE-agent__test-repo -name "missing_colon.py"
🧰 DEBUG Retrieved state from environment: {'working_dir': '/SWE-agent__test-repo'}
🤠 INFO 🤖 MODEL INPUT
OBSERVATION:
/SWE-agent__test-repo/tests/missing_colon.py
/SWE-agent__test-repo/src/testpkg/missing_colon.py
🤠 INFO ========================= STEP 2 =========================
🤖 DEBUG n_cache_control: 2
🤖 INFO Response: ModelResponse(id='chatcmpl-0ac17933-0d78-4445-b469-a4441fb57403', created=1746538404, model='claude-3-7-sonnet-latest', object='chat.completion', system_fingerprint=None,
choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content="Great! I found two instances of `missing_colon.py`. Let's examine both files to understand what's going on.",
role='assistant', tool_calls=[ChatCompletionMessageToolCall(index=1, function=Function(arguments='{"command": "view", "path": "/SWE-agent__test-repo/tests/missing_colon.py"}',
name='str_replace_editor'), id='toolu_01HGHHFG2PHgdM3HtpfpmqMZ', type='function')], function_call=None))], usage=Usage(completion_tokens=119, prompt_tokens=1752, total_tokens=1871,
completion_tokens_details=None, prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=1549, text_tokens=None, image_tokens=None), cache_creation_input_tokens=197,
cache_read_input_tokens=1549))
🤖 DEBUG input_tokens=502, output_tokens=25, instance_cost=0.02, cost=0.00
🤖 DEBUG total_tokens_sent=894, total_tokens_received=89, total_cost=0.02, total_api_calls=2
🤠 INFO 💭 THOUGHT
Great! I found two instances of `missing_colon.py`. Let's examine both files to understand what's going on.
🎬 ACTION
str_replace_editor view /SWE-agent__test-repo/tests/missing_colon.py
🧰 DEBUG Retrieved state from environment: {'working_dir': '/SWE-agent__test-repo'}
🤠 INFO 🤖 MODEL INPUT
OBSERVATION:
Here's the result of running `cat -n` on /SWE-agent__test-repo/tests/missing_colon.py:
1 #!/usr/bin/env python3
2
3
4 def division(a: float, b: float) -> float
5 return a/b
6
7
8 if __name__ == "__main__":
9 print(division(123, 15))
10
11
🤠 INFO ========================= STEP 3 =========================
🤖 DEBUG n_cache_control: 2
🤖 INFO Response: ModelResponse(id='chatcmpl-34ea0f17-9082-436b-b832-ea1dcca638c8', created=1746538409, model='claude-3-7-sonnet-latest', object='chat.completion', system_fingerprint=None,
choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content=None, role='assistant', tool_calls=[ChatCompletionMessageToolCall(index=0, function=Function(arguments='{"command": "view",
"path": "/SWE-agent__test-repo/src/testpkg/missing_colon.py"}', name='str_replace_editor'), id='toolu_01QKQhZWPgtvPcVkgMzHWecB', type='function')], function_call=None))],
usage=Usage(completion_tokens=95, prompt_tokens=2006, total_tokens=2101, completion_tokens_details=None, prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=1746,
text_tokens=None, image_tokens=None), cache_creation_input_tokens=254, cache_read_input_tokens=1746))
🤖 DEBUG input_tokens=666, output_tokens=0, instance_cost=0.02, cost=0.00
🤖 DEBUG total_tokens_sent=1,560, total_tokens_received=89, total_cost=0.02, total_api_calls=3
🤠 INFO 💭 THOUGHT
🎬 ACTION
str_replace_editor view /SWE-agent__test-repo/src/testpkg/missing_colon.py
🧰 DEBUG Retrieved state from environment: {'working_dir': '/SWE-agent__test-repo'}
🤠 INFO 🤖 MODEL INPUT
OBSERVATION:
Here's the result of running `cat -n` on /SWE-agent__test-repo/src/testpkg/missing_colon.py:
1 #!/usr/bin/env python3
2
3
4 def division(a: float, b: float) -> float
5 return a/b
6
7
8 if __name__ == "__main__":
9 print(division(123, 15))
10
11
🤠 INFO ========================= STEP 4 =========================
🤖 DEBUG n_cache_control: 2
🤖 INFO Response: ModelResponse(id='chatcmpl-65910764-3965-4117-a2da-dd14a7e6d711', created=1746538416, model='claude-3-7-sonnet-latest', object='chat.completion', system_fingerprint=None,
choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content="I can see the issue now. In both files, there's a missing colon (`:`) at the end of the function definition on line 4. In
Python, function definitions need to end with a colon before the function body. The error message in the PR description points to this exact issue.\n\n## Step 2: Create a script to reproduce the
error\n\nLet's create a simple script to reproduce the error:", role='assistant', tool_calls=[ChatCompletionMessageToolCall(index=1, function=Function(arguments='{"command": "create", "path":
"/SWE-agent__test-repo/reproduce_error.py", "file_text": "#!/usr/bin/env python3\\n\\nimport sys\\nsys.path.append(\'/SWE-agent__test-repo\')\\nfrom tests.missing_colon import division\\n\\nif __name__
== \\"__main__\\":\\n try:\\n result = division(23, 0)\\n print(f\\"Result: {result}\\")\\n except Exception as e:\\n print(f\\"Error: {type(e).__name__}: {e}\\")"}',
name='str_replace_editor'), id='toolu_01AsaWoxeJrC44duFwrNejUM', type='function')], function_call=None))], usage=Usage(completion_tokens=295, prompt_tokens=2240, total_tokens=2535,
completion_tokens_details=None, prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=2000, text_tokens=None, image_tokens=None), cache_creation_input_tokens=234,
cache_read_input_tokens=2000))
🤖 DEBUG input_tokens=809, output_tokens=84, instance_cost=0.03, cost=0.01
🤖 DEBUG total_tokens_sent=2,369, total_tokens_received=173, total_cost=0.03, total_api_calls=4
🤠 INFO 💭 THOUGHT
I can see the issue now. In both files, there's a missing colon (`:`) at the end of the function definition on line 4. In Python, function definitions need to end with a colon before the function body.
The error message in the PR description points to this exact issue.
## Step 2: Create a script to reproduce the error
Let's create a simple script to reproduce the error:
🎬 ACTION
str_replace_editor create /SWE-agent__test-repo/reproduce_error.py --file_text '#!/usr/bin/env python3
import sys
sys.path.append('"'"'/SWE-agent__test-repo'"'"')
from tests.missing_colon import division
if __name__ == "__main__":
try:
result = division(23, 0)
print(f"Result: {result}")
except Exception as e:
print(f"Error: {type(e).__name__}: {e}")'
🧰 DEBUG Retrieved state from environment: {'working_dir': '/SWE-agent__test-repo'}
🤠 INFO 🤖 MODEL INPUT
OBSERVATION:
File created successfully at: /SWE-agent__test-repo/reproduce_error.py
🤠 INFO ========================= STEP 5 =========================
🤖 DEBUG n_cache_control: 2
🤖 INFO Response: ModelResponse(id='chatcmpl-74734059-3124-4d62-85a2-2f9c0dbafb67', created=1746538419, model='claude-3-7-sonnet-latest', object='chat.completion', system_fingerprint=None,
choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content="Now let's run this script to confirm the error:", role='assistant', tool_calls=[ChatCompletionMessageToolCall(index=1,
function=Function(arguments='{"command": "cd /SWE-agent__test-repo && python reproduce_error.py"}', name='bash'), id='toolu_01DUDQaTLvo9wMtdoRuLzng2', type='function')], function_call=None))],
usage=Usage(completion_tokens=81, prompt_tokens=2570, total_tokens=2651, completion_tokens_details=None, prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=2234,
text_tokens=None, image_tokens=None), cache_creation_input_tokens=330, cache_read_input_tokens=2234))
🤖 DEBUG input_tokens=1,041, output_tokens=11, instance_cost=0.03, cost=0.00
🤖 DEBUG total_tokens_sent=3,410, total_tokens_received=184, total_cost=0.03, total_api_calls=5
🤠 INFO 💭 THOUGHT
Now let's run this script to confirm the error:
🎬 ACTION
cd /SWE-agent__test-repo && python reproduce_error.py
🧰 DEBUG Retrieved state from environment: {'working_dir': '/SWE-agent__test-repo'}
🤠 INFO 🤖 MODEL INPUT
OBSERVATION:
Traceback (most recent call last):
File "/SWE-agent__test-repo/reproduce_error.py", line 5, in <module>
from tests.missing_colon import division
File "/SWE-agent__test-repo/tests/missing_colon.py", line 4
def division(a: float, b: float) -> float
^
SyntaxError: expected ':'
🤠 INFO ========================= STEP 6 =========================
🤖 DEBUG n_cache_control: 2
🤖 INFO Response: ModelResponse(id='chatcmpl-7538e5bc-d5ae-4845-9bd1-aff6388c779e', created=1746538424, model='claude-3-7-sonnet-latest', object='chat.completion', system_fingerprint=None,
choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content="Perfect! We've reproduced the error. The error message confirms that there's a missing colon at the end of the function
definition.\n\n## Step 3: Edit the source code to resolve the issue\n\nNow, let's fix both instances of the file. We need to add a colon at the end of the function definition on line 4.\n\nFirst, let's
fix the file in the tests directory:", role='assistant', tool_calls=[ChatCompletionMessageToolCall(index=1, function=Function(arguments='{"command": "str_replace", "path":
"/SWE-agent__test-repo/tests/missing_colon.py", "old_str": "def division(a: float, b: float) -> float\\n return a/b", "new_str": "def division(a: float, b: float) -> float:\\n return a/b"}',
name='str_replace_editor'), id='toolu_01XZBkQK2o3Z64j5oAc4uysv', type='function')], function_call=None))], usage=Usage(completion_tokens=255, prompt_tokens=2769, total_tokens=3024,
completion_tokens_details=None, prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=2564, text_tokens=None, image_tokens=None), cache_creation_input_tokens=199,
cache_read_input_tokens=2564))
🤖 DEBUG input_tokens=1,160, output_tokens=81, instance_cost=0.04, cost=0.01
🤖 DEBUG total_tokens_sent=4,570, total_tokens_received=265, total_cost=0.04, total_api_calls=6
🤠 INFO 💭 THOUGHT
Perfect! We've reproduced the error. The error message confirms that there's a missing colon at the end of the function definition.
## Step 3: Edit the source code to resolve the issue
Now, let's fix both instances of the file. We need to add a colon at the end of the function definition on line 4.
First, let's fix the file in the tests directory:
🎬 ACTION
str_replace_editor str_replace /SWE-agent__test-repo/tests/missing_colon.py --old_str 'def division(a: float, b: float) -> float
return a/b' --new_str 'def division(a: float, b: float) -> float:
return a/b'
🧰 DEBUG Retrieved state from environment: {'working_dir': '/SWE-agent__test-repo'}
🤠 INFO 🤖 MODEL INPUT
OBSERVATION:
The file /SWE-agent__test-repo/tests/missing_colon.py has been edited. Here's the result of running `cat -n` on a snippet of /SWE-agent__test-repo/tests/missing_colon.py:
1 #!/usr/bin/env python3
2
3
4 def division(a: float, b: float) -> float:
5 return a/b
6
7
8 if __name__ == "__main__":
Review the changes and make sure they are as expected. Edit the file again if necessary.
🤠 INFO ========================= STEP 7 =========================
🤖 DEBUG n_cache_control: 2
🤖 INFO Response: ModelResponse(id='chatcmpl-a89e5679-cfd2-48f8-988c-b515e960d3ac', created=1746538429, model='claude-3-7-sonnet-latest', object='chat.completion', system_fingerprint=None,
choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content="Now, let's fix the file in the src directory:", role='assistant', tool_calls=[ChatCompletionMessageToolCall(index=1,
function=Function(arguments='{"command": "str_replace", "path": "/SWE-agent__test-repo/src/testpkg/missing_colon.py", "old_str": "def division(a: float, b: float) -> float\\n return a/b", "new_str":
"def division(a: float, b: float) -> float:\\n return a/b"}', name='str_replace_editor'), id='toolu_01FqQxXsasC7qavJozEmcba1', type='function')], function_call=None))],
usage=Usage(completion_tokens=185, prompt_tokens=3185, total_tokens=3370, completion_tokens_details=None, prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=2763,
text_tokens=None, image_tokens=None), cache_creation_input_tokens=417, cache_read_input_tokens=2763))
🤖 DEBUG input_tokens=1,449, output_tokens=12, instance_cost=0.04, cost=0.01
🤖 DEBUG total_tokens_sent=6,019, total_tokens_received=277, total_cost=0.04, total_api_calls=7
🤠 INFO 💭 THOUGHT
Now, let's fix the file in the src directory:
🎬 ACTION
str_replace_editor str_replace /SWE-agent__test-repo/src/testpkg/missing_colon.py --old_str 'def division(a: float, b: float) -> float
return a/b' --new_str 'def division(a: float, b: float) -> float:
return a/b'
🧰 DEBUG Retrieved state from environment: {'working_dir': '/SWE-agent__test-repo'}
🤠 INFO 🤖 MODEL INPUT
OBSERVATION:
The file /SWE-agent__test-repo/src/testpkg/missing_colon.py has been edited. Here's the result of running `cat -n` on a snippet of /SWE-agent__test-repo/src/testpkg/missing_colon.py:
1 #!/usr/bin/env python3
2
3
4 def division(a: float, b: float) -> float:
5 return a/b
6
7
8 if __name__ == "__main__":
Review the changes and make sure they are as expected. Edit the file again if necessary.
🤠 INFO ========================= STEP 8 =========================
🤖 DEBUG n_cache_control: 2
🤖 INFO Response: ModelResponse(id='chatcmpl-c0405095-900e-45b4-8ade-3dd0f7b9ae68', created=1746538432, model='claude-3-7-sonnet-latest', object='chat.completion', system_fingerprint=None,
choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content="## Step 4: Rerun the script to confirm the fix\n\nLet's run our script again to confirm that the error is fixed:",
role='assistant', tool_calls=[ChatCompletionMessageToolCall(index=1, function=Function(arguments='{"command": "cd /SWE-agent__test-repo && python reproduce_error.py"}', name='bash'),
id='toolu_01D6GnG7M5dzoo6DrV2m4kKT', type='function')], function_call=None))], usage=Usage(completion_tokens=99, prompt_tokens=3539, total_tokens=3638, completion_tokens_details=None,
prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=3180, text_tokens=None, image_tokens=None), cache_creation_input_tokens=355, cache_read_input_tokens=3180))
🤖 DEBUG input_tokens=1,674, output_tokens=29, instance_cost=0.05, cost=0.00
🤖 DEBUG total_tokens_sent=7,693, total_tokens_received=306, total_cost=0.05, total_api_calls=8
🤠 INFO 💭 THOUGHT
## Step 4: Rerun the script to confirm the fix
Let's run our script again to confirm that the error is fixed:
🎬 ACTION
cd /SWE-agent__test-repo && python reproduce_error.py
🧰 DEBUG Retrieved state from environment: {'working_dir': '/SWE-agent__test-repo'}
🤠 INFO 🤖 MODEL INPUT
OBSERVATION:
Error: ZeroDivisionError: division by zero
🤠 INFO ========================= STEP 9 =========================
🤖 DEBUG n_cache_control: 2
🤖 INFO Response: ModelResponse(id='chatcmpl-06c071f0-2c38-44e4-845d-10f9f25a58d3', created=1746538442, model='claude-3-7-sonnet-latest', object='chat.completion', system_fingerprint=None,
choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content="Great! The syntax error is fixed. Now we're getting a different error, which is expected because we're trying to divide by
zero. This confirms that our fix for the syntax error is working correctly.\n\n## Step 5: Think about edge cases and make sure the fix handles them\n\nLet's modify our script to test with valid inputs
to ensure the function works correctly:", role='assistant', tool_calls=[ChatCompletionMessageToolCall(index=1, function=Function(arguments='{"command": "str_replace", "path":
"/SWE-agent__test-repo/reproduce_error.py", "old_str": "if __name__ == \\"__main__\\":\\n try:\\n result = division(23, 0)\\n print(f\\"Result: {result}\\")\\n except Exception as
e:\\n print(f\\"Error: {type(e).__name__}: {e}\\")", "new_str": "if __name__ == \\"__main__\\":\\n # Test with valid inputs\\n try:\\n result = division(23, 5)\\n
print(f\\"Result of 23/5: {result}\\")\\n except Exception as e:\\n print(f\\"Error: {type(e).__name__}: {e}\\")\\n \\n # Test with zero division (edge case)\\n try:\\n result
= division(23, 0)\\n print(f\\"Result of 23/0: {result}\\")\\n except Exception as e:\\n print(f\\"Error: {type(e).__name__}: {e}\\")\\n \\n # Test with non-numeric inputs
(edge case)\\n try:\\n result = division(\\"23\\", 5)\\n print(f\\"Result of \'23\'/5: {result}\\")\\n except Exception as e:\\n print(f\\"Error: {type(e).__name__}:
{e}\\")"}', name='str_replace_editor'), id='toolu_01LsPJ5ZroMPv8nybnEePMkg', type='function')], function_call=None))], usage=Usage(completion_tokens=479, prompt_tokens=3665, total_tokens=4144,
completion_tokens_details=None, prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=3535, text_tokens=None, image_tokens=None), cache_creation_input_tokens=126,
cache_read_input_tokens=3535))
🤖 DEBUG input_tokens=1,736, output_tokens=75, instance_cost=0.06, cost=0.01
🤖 DEBUG total_tokens_sent=9,429, total_tokens_received=381, total_cost=0.06, total_api_calls=9
🤠 INFO 💭 THOUGHT
Great! The syntax error is fixed. Now we're getting a different error, which is expected because we're trying to divide by zero. This confirms that our fix for the syntax error is working correctly.
## Step 5: Think about edge cases and make sure the fix handles them
Let's modify our script to test with valid inputs to ensure the function works correctly:
🎬 ACTION
str_replace_editor str_replace /SWE-agent__test-repo/reproduce_error.py --old_str 'if __name__ == "__main__":
try:
result = division(23, 0)
print(f"Result: {result}")
except Exception as e:
print(f"Error: {type(e).__name__}: {e}")' --new_str 'if __name__ == "__main__":
# Test with valid inputs
try:
result = division(23, 5)
print(f"Result of 23/5: {result}")
except Exception as e:
print(f"Error: {type(e).__name__}: {e}")
# Test with zero division (edge case)
try:
result = division(23, 0)
print(f"Result of 23/0: {result}")
except Exception as e:
print(f"Error: {type(e).__name__}: {e}")
# Test with non-numeric inputs (edge case)
try:
result = division("23", 5)
print(f"Result of '"'"'23'"'"'/5: {result}")
except Exception as e:
print(f"Error: {type(e).__name__}: {e}")'
🧰 DEBUG Retrieved state from environment: {'working_dir': '/SWE-agent__test-repo'}
🤠 INFO 🤖 MODEL INPUT
OBSERVATION:
The file /SWE-agent__test-repo/reproduce_error.py has been edited. Here's the result of running `cat -n` on a snippet of /SWE-agent__test-repo/reproduce_error.py:
2
3 import sys
4 sys.path.append('/SWE-agent__test-repo')
5 from tests.missing_colon import division
6
7 if __name__ == "__main__":
8 # Test with valid inputs
9 try:
10 result = division(23, 5)
11 print(f"Result of 23/5: {result}")
12 except Exception as e:
13 print(f"Error: {type(e).__name__}: {e}")
14
15 # Test with zero division (edge case)
16 try:
17 result = division(23, 0)
18 print(f"Result of 23/0: {result}")
19 except Exception as e:
20 print(f"Error: {type(e).__name__}: {e}")
21
22 # Test with non-numeric inputs (edge case)
23 try:
24 result = division("23", 5)
25 print(f"Result of '23'/5: {result}")
26 except Exception as e:
27 print(f"Error: {type(e).__name__}: {e}")
Review the changes and make sure they are as expected. Edit the file again if necessary.
🤠 INFO ========================= STEP 10 =========================
🤖 DEBUG n_cache_control: 2
🤖 INFO Response: ModelResponse(id='chatcmpl-7eeef2cd-5f7f-478a-b04e-5c08fe931681', created=1746538445, model='claude-3-7-sonnet-latest', object='chat.completion', system_fingerprint=None,
choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content="Now let's run the updated script to test our function with different inputs:", role='assistant',
tool_calls=[ChatCompletionMessageToolCall(index=1, function=Function(arguments='{"command": "cd /SWE-agent__test-repo && python reproduce_error.py"}', name='bash'), id='toolu_01PUxRyfhcoATyPuYb2rGGuv',
type='function')], function_call=None))], usage=Usage(completion_tokens=85, prompt_tokens=4557, total_tokens=4642, completion_tokens_details=None,
prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=3661, text_tokens=None, image_tokens=None), cache_creation_input_tokens=893, cache_read_input_tokens=3661))
🤖 DEBUG input_tokens=2,492, output_tokens=15, instance_cost=0.07, cost=0.01
🤖 DEBUG total_tokens_sent=11,921, total_tokens_received=396, total_cost=0.07, total_api_calls=10
🤠 INFO 💭 THOUGHT
Now let's run the updated script to test our function with different inputs:
🎬 ACTION
cd /SWE-agent__test-repo && python reproduce_error.py
🧰 DEBUG Retrieved state from environment: {'working_dir': '/SWE-agent__test-repo'}
🤠 INFO 🤖 MODEL INPUT
OBSERVATION:
Result of 23/5: 4.6
Error: ZeroDivisionError: division by zero
Error: TypeError: unsupported operand type(s) for /: 'str' and 'int'
🤠 INFO ========================= STEP 11 =========================
🤖 DEBUG n_cache_control: 2
🤖 INFO Response: ModelResponse(id='chatcmpl-17f662dd-690b-4812-b979-02b47f31f22b', created=1746538449, model='claude-3-7-sonnet-latest', object='chat.completion', system_fingerprint=None,
choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content="Perfect! Our function is now working correctly:\n1. It successfully performs division with valid inputs (23/5 = 4.6)\n2. It
properly raises a ZeroDivisionError when dividing by zero\n3. It raises a TypeError when trying to divide a string by an integer\n\nLet's also run the original file directly to make sure it works:",
role='assistant', tool_calls=[ChatCompletionMessageToolCall(index=1, function=Function(arguments='{"command": "cd /SWE-agent__test-repo && python tests/missing_colon.py"}', name='bash'),
id='toolu_01RHEHw65rQsYv5x41cRoARP', type='function')], function_call=None))], usage=Usage(completion_tokens=152, prompt_tokens=4707, total_tokens=4859, completion_tokens_details=None,
prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=4554, text_tokens=None, image_tokens=None), cache_creation_input_tokens=150, cache_read_input_tokens=4554))
🤖 DEBUG input_tokens=2,571, output_tokens=72, instance_cost=0.07, cost=0.00
🤖 DEBUG total_tokens_sent=14,492, total_tokens_received=468, total_cost=0.07, total_api_calls=11
🤠 INFO 💭 THOUGHT
Perfect! Our function is now working correctly:
1. It successfully performs division with valid inputs (23/5 = 4.6)
2. It properly raises a ZeroDivisionError when dividing by zero
3. It raises a TypeError when trying to divide a string by an integer
Let's also run the original file directly to make sure it works:
🎬 ACTION
cd /SWE-agent__test-repo && python tests/missing_colon.py
🧰 DEBUG Retrieved state from environment: {'working_dir': '/SWE-agent__test-repo'}
🤠 INFO 🤖 MODEL INPUT
OBSERVATION:
8.2
🤠 INFO ========================= STEP 12 =========================
🤖 DEBUG n_cache_control: 2
🤖 INFO Response: ModelResponse(id='chatcmpl-c8c1f2f8-14fb-41a8-9388-2a8b78f1d0a6', created=1746538453, model='claude-3-7-sonnet-latest', object='chat.completion', system_fingerprint=None,
choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content=None, role='assistant', tool_calls=[ChatCompletionMessageToolCall(index=0, function=Function(arguments='{"command": "cd
/SWE-agent__test-repo && python src/testpkg/missing_colon.py"}', name='bash'), id='toolu_01MiJkjBLRpnuacB4SqBNuC3', type='function')], function_call=None))], usage=Usage(completion_tokens=76,
prompt_tokens=4877, total_tokens=4953, completion_tokens_details=None, prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=4704, text_tokens=None, image_tokens=None),
cache_creation_input_tokens=170, cache_read_input_tokens=4704))
🤖 DEBUG input_tokens=2,673, output_tokens=0, instance_cost=0.07, cost=0.00
🤖 DEBUG total_tokens_sent=17,165, total_tokens_received=468, total_cost=0.07, total_api_calls=12
🤠 INFO 💭 THOUGHT
🎬 ACTION
cd /SWE-agent__test-repo && python src/testpkg/missing_colon.py
🧰 DEBUG Retrieved state from environment: {'working_dir': '/SWE-agent__test-repo'}
🤠 INFO 🤖 MODEL INPUT
OBSERVATION:
8.2
🤠 INFO ========================= STEP 13 =========================
🤖 DEBUG n_cache_control: 2
🤖 INFO Response: ModelResponse(id='chatcmpl-36239584-4d33-4901-af1c-0212364d238b', created=1746538459, model='claude-3-7-sonnet-latest', object='chat.completion', system_fingerprint=None,
choices=[Choices(finish_reason='stop', index=0, message=Message(content="Great! Both files are now working correctly and producing the expected output (123/15 = 8.2).\n\n## Summary\n\nThe issue was a
simple syntax error in the function definition. In Python, function definitions must end with a colon (`:`) before the function body. The error was present in both
`/SWE-agent__test-repo/tests/missing_colon.py` and `/SWE-agent__test-repo/src/testpkg/missing_colon.py`.\n\nThe fix was to add the missing colon at the end of the function definition:\n```python\ndef
division(a: float, b: float) -> float: # Added colon here\n return a/b\n```\n\nAfter making this change, the code now runs correctly. We've also tested it with various inputs to ensure it handles
different scenarios properly:\n1. Normal division works as expected\n2. Division by zero raises the appropriate ZeroDivisionError\n3. Invalid input types raise the appropriate TypeError\n\nThis fix was
minimal and focused only on the syntax error mentioned in the PR description.", role='assistant', tool_calls=None, function_call=None))], usage=Usage(completion_tokens=243, prompt_tokens=4971,
total_tokens=5214, completion_tokens_details=None, prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=4874, text_tokens=None, image_tokens=None),
cache_creation_input_tokens=94, cache_read_input_tokens=4874))
🤖 DEBUG input_tokens=2,705, output_tokens=214, instance_cost=0.08, cost=0.01
🤖 DEBUG total_tokens_sent=19,870, total_tokens_received=682, total_cost=0.08, total_api_calls=13
🤠 WARN Requerying model after FunctionCallingFormatError (1th requery)
🤠 WARN Your last output did not use any tool calls!
Please make sure your output includes exactly _ONE_ function call!
You must invoke the function directly using the function call format.
You cannot invoke commands with ```, you have to use the function call format.
If you think you have already resolved the issue, please submit your changes by running the `submit` command.
If you think you cannot solve the problem, please run `exit_forfeit` (if available) or `submit`.
Else, please continue with a new tool call!
🤖 DEBUG n_cache_control: 2
🤖 INFO Response: ModelResponse(id='chatcmpl-ffd05c35-ed01-4695-9a20-f81b18e326d1', created=1746538463, model='claude-3-7-sonnet-latest', object='chat.completion', system_fingerprint=None,
choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content="You're right, I apologize for the oversight. Let me submit the changes since we've successfully fixed the issue.",
role='assistant', tool_calls=[ChatCompletionMessageToolCall(index=1, function=Function(arguments='{}', name='submit'), id='toolu_01Uc1pRvAk6bTLFuQQVWZiUF', type='function')], function_call=None))],
usage=Usage(completion_tokens=59, prompt_tokens=5332, total_tokens=5391, completion_tokens_details=None, prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=4968,
text_tokens=None, image_tokens=None), cache_creation_input_tokens=0, cache_read_input_tokens=4968))
🤖 DEBUG input_tokens=3,026, output_tokens=23, instance_cost=0.08, cost=0.00
🤖 DEBUG total_tokens_sent=22,896, total_tokens_received=705, total_cost=0.08, total_api_calls=14
🤠 INFO 💭 THOUGHT
You're right, I apologize for the oversight. Let me submit the changes since we've successfully fixed the issue.
🎬 ACTION
submit
🧰 DEBUG Retrieved state from environment: {'working_dir': '/SWE-agent__test-repo'}
🤠 INFO 🤖 MODEL INPUT
OBSERVATION:
Thank you for your work on this issue. Please carefully follow the steps below to help review your changes.
1. If you made any changes to your code after running the reproduction script, please run the reproduction script again.
If the reproduction script is failing, please revisit your changes and make sure they are correct.
If you have already removed your reproduction script, please ignore this step.
2. Remove your reproduction script (if you haven't done so already).
3. If you have modified any TEST files, please revert them to the state they had before you started fixing the issue.
You can do this with `git checkout -- /path/to/test/file.py`. Use below <diff> to find the files you need to revert.
4. Run the submit command again to confirm.
Here is a list of all of your changes:
<diff>
diff --git a/reproduce_error.py b/reproduce_error.py
new file mode 100644
index 0000000..0c95db1
--- /dev/null
+++ b/reproduce_error.py
@@ -0,0 +1,27 @@
+#!/usr/bin/env python3
+
+import sys
+sys.path.append('/SWE-agent__test-repo')
+from tests.missing_colon import division
+
+if __name__ == "__main__":
+ # Test with valid inputs
+ try:
+ result = division(23, 5)
+ print(f"Result of 23/5: {result}")
+ except Exception as e:
+ print(f"Error: {type(e).__name__}: {e}")
+
+ # Test with zero division (edge case)
+ try:
+ result = division(23, 0)
+ print(f"Result of 23/0: {result}")
+ except Exception as e:
+ print(f"Error: {type(e).__name__}: {e}")
+
+ # Test with non-numeric inputs (edge case)
+ try:
+ result = division("23", 5)
+ print(f"Result of '23'/5: {result}")
+ except Exception as e:
+ print(f"Error: {type(e).__name__}: {e}")
\ No newline at end of file
diff --git a/src/testpkg/missing_colon.py b/src/testpkg/missing_colon.py
index 20edef5..5857437 100755
--- a/src/testpkg/missing_colon.py
+++ b/src/testpkg/missing_colon.py
@@ -1,7 +1,7 @@
#!/usr/bin/env python3
-def division(a: float, b: float) -> float
+def division(a: float, b: float) -> float:
return a/b
diff --git a/tests/missing_colon.py b/tests/missing_colon.py
index 20edef5..5857437 100755
--- a/tests/missing_colon.py
+++ b/tests/missing_colon.py
@@ -1,7 +1,7 @@
#!/usr/bin/env python3
-def division(a: float, b: float) -> float
+def division(a: float, b: float) -> float:
return a/b
</diff>
🤠 INFO ========================= STEP 15 =========================
🤖 DEBUG n_cache_control: 2
🤖 INFO Response: ModelResponse(id='chatcmpl-6879799c-beea-4522-ad69-b8ad3b73abd1', created=1746538467, model='claude-3-7-sonnet-latest', object='chat.completion', system_fingerprint=None,
choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content='I need to remove the reproduction script before submitting again:', role='assistant',
tool_calls=[ChatCompletionMessageToolCall(index=1, function=Function(arguments='{"command": "rm /SWE-agent__test-repo/reproduce_error.py"}', name='bash'), id='toolu_01XQwhvtX6ivmA2UMay32HfR',
type='function')], function_call=None))], usage=Usage(completion_tokens=81, prompt_tokens=5882, total_tokens=5963, completion_tokens_details=None,
prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=4968, text_tokens=None, image_tokens=None), cache_creation_input_tokens=912, cache_read_input_tokens=4968))
🤖 DEBUG input_tokens=3,433, output_tokens=11, instance_cost=0.09, cost=0.01
🤖 DEBUG total_tokens_sent=26,329, total_tokens_received=716, total_cost=0.09, total_api_calls=15
🤠 INFO 💭 THOUGHT
I need to remove the reproduction script before submitting again:
🎬 ACTION
rm /SWE-agent__test-repo/reproduce_error.py
🧰 DEBUG Retrieved state from environment: {'working_dir': '/SWE-agent__test-repo'}
🤠 INFO 🤖 MODEL INPUT
Your command ran successfully and did not produce any output.
🤠 INFO ========================= STEP 16 =========================
🤖 DEBUG n_cache_control: 2
🤖 INFO Response: ModelResponse(id='chatcmpl-4f8b36d7-e8ff-43a4-9c1e-2dbf224beeaa', created=1746538471, model='claude-3-7-sonnet-latest', object='chat.completion', system_fingerprint=None,
choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content="Now let's submit again:", role='assistant', tool_calls=[ChatCompletionMessageToolCall(index=1,
function=Function(arguments='{}', name='submit'), id='toolu_01DvZjwT6EgTcJ7sDDFi9Znz', type='function')], function_call=None))], usage=Usage(completion_tokens=42, prompt_tokens=5984, total_tokens=6026,
completion_tokens_details=None, prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=None, cached_tokens=5880, text_tokens=None, image_tokens=None), cache_creation_input_tokens=103,
cache_read_input_tokens=5880))
🤖 DEBUG input_tokens=3,472, output_tokens=6, instance_cost=0.09, cost=0.00
🤖 DEBUG total_tokens_sent=29,801, total_tokens_received=722, total_cost=0.09, total_api_calls=16
🤠 INFO 💭 THOUGHT
Now let's submit again:
🎬 ACTION
submit
🧰 DEBUG Retrieved state from environment: {'working_dir': '/SWE-agent__test-repo'}
🤠 INFO Found submission: diff --git a/src/testpkg/missing_colon.py b/src/testpkg/missing_colon.py
index 20edef5..5857437 100755
--- a/src/testpkg/missing_colon.py
+++ b/src/testpkg/missing_colon.py
@@ -1,7 +1,7 @@
#!/usr/bin/env python3
-def division(a: float, b: float) -> float
+def division(a: float, b: float) -> float:
return a/b
diff --git a/tests/missing_colon.py b/tests/missing_colon.py
index 20edef5..5857437 100755
--- a/tests/missing_colon.py
+++ b/tests/missing_colon.py
@@ -1,7 +1,7 @@
#!/usr/bin/env python3
-def division(a: float, b: float) -> float
+def division(a: float, b: float) -> float:
return a/b
🤠 INFO 🤖 MODEL INPUT
OBSERVATION:
diff --git a/src/testpkg/missing_colon.py b/src/testpkg/missing_colon.py
index 20edef5..5857437 100755
--- a/src/testpkg/missing_colon.py
+++ b/src/testpkg/missing_colon.py
@@ -1,7 +1,7 @@
#!/usr/bin/env python3
-def division(a: float, b: float) -> float
+def division(a: float, b: float) -> float:
return a/b
diff --git a/tests/missing_colon.py b/tests/missing_colon.py
index 20edef5..5857437 100755
--- a/tests/missing_colon.py
+++ b/tests/missing_colon.py
@@ -1,7 +1,7 @@
#!/usr/bin/env python3
-def division(a: float, b: float) -> float
+def division(a: float, b: float) -> float:
return a/b
🤠 INFO Trajectory saved to
/Users/fuchur/Documents/24/git_sync/SWE-agent/trajectories/fuchur/anthropic_filemap__claude-3-7-sonnet-latest__t-0.00__p-1.00__c-2.00___SWE-agent__test-repo-i1/SWE-agent__test-repo-i1/SWE-agent__test-repo-i1.traj
╭──────────────────────────── 🎉 Submission successful 🎉 ────────────────────────────╮
│ SWE-agent has produced a patch that it believes will solve the issue you submitted! │
│ Use the code snippet below to inspect or apply it! │
╰─────────────────────────────────────────────────────────────────────────────────────╯
# The patch has been saved to your local filesystem at:
PATCH_FILE_PATH='/Users/fuchur/Documents/24/git_sync/SWE-agent/trajectories/fuchur/anthropic_filemap__claude-3-7-sonnet-latest__t-0.00__p-1.00__c-2.00___SWE-agent__test-repo-i1/SWE-agent__test-repo-i1/SWE-agent__
st-repo-i1.patch'
# Inspect it:
cat "${PATCH_FILE_PATH}"
# Apply it to a local repository:
cd <your local repo root>
git apply "${PATCH_FILE_PATH}"
🏃 INFO Done
🪴 INFO Beginning environment shutdown...
🦖 DEBUG Ensuring deployment is stopped because object is deleted

View File

@@ -0,0 +1,46 @@
# User guides
This part of the documentation covers more details about configuring and developing SWE-agent.
Just want to run it? See [getting started](../index.md) instead!
<div class="grid cards">
<a href="../background/" class="nav-card-link">
<div class="nav-card">
<div class="nav-card-header">
<span class="material-icons nav-card-icon">book</span>
<span class="nav-card-title">Background & goals</span>
</div>
<p class="nav-card-description">Learn more about the project goals and academic research.</p>
</div>
</a>
<a href="../usage/cli/" class="nav-card-link">
<div class="nav-card">
<div class="nav-card-header">
<span class="material-icons nav-card-icon">book</span>
<span class="nav-card-title">Command line interface</span>
</div>
<p class="nav-card-description">Learn about all the different subcommands of sweagent.</p>
</div>
</a>
<a href="../config/config/" class="nav-card-link">
<div class="nav-card">
<div class="nav-card-header">
<span class="material-icons nav-card-icon">settings</span>
<span class="nav-card-title">Configuration</span>
</div>
<p class="nav-card-description">How to configure SWE-agent.</p>
</div>
</a>
<a href="multimodal/" class="nav-card-link">
<div class="nav-card">
<div class="nav-card-header">
<span class="material-icons nav-card-icon">image</span>
<span class="nav-card-title">Multimodal Support</span>
</div>
<p class="nav-card-description">Use SWE-agent with vision-capable AI models for image processing.</p>
</div>
</a>
</div>

View File

@@ -0,0 +1,74 @@
# Inspecting trajectories
!!! abstract "Inspecting trajectories"
    * Trajectories are the main output of SWE-agent. They are the best way to understand what SWE-agent does, especially when running on many problem instances.
    * We provide two tools for visualizing the [`.traj` files](trajectories.md) from the `trajectories` folder more easily.
        * Use `sweagent inspect` (or `sweagent i`) to open the command line inspector.
        * Use `sweagent inspector` (or `sweagent I`) to open the web inspector.
    * Please complete the [hello world](hello_world.md) tutorial before proceeding.
You should see a folder called `trajectories` in your working directory. Let's go to one of the *experiment directories*:
```bash
cd trajectories/$USER/<some directory> # (1)!
```
1. Don't have a folder here? Make sure to run SWE-agent at least once.
## Command line inspector
<img src="https://github.com/user-attachments/assets/808a1a9c-69c2-47c2-bd65-b50a16a03711">
Run the inspector in the directory containing your `.traj` files:
```bash
sweagent inspect
# or
sweagent i
```
You will be put into a pager that lets you navigate between trajectories.
Here's how to navigate (this is similar to vim keybindings):
* Use `q` to quit
* Switching between trajectories:
    * `H` and `L` go to the previous/next trajectory
    * `t` brings up a list of all trajectories. Use type-ahead search to find a specific trajectory (once your search string results in a single match, the trajectory will be opened). Press `<TAB>` to cycle through the list of matches.
* Use `h` and `l` to navigate between the steps in the trajectory
* Use `j` and `k` to scroll down/up
* By default we only show reduced information. You can press `v` to toggle the view.
* Press `o` to open the logs
* Some views offer to open a file in your `$EDITOR` when you press `e`. For this to work, the `EDITOR` environment variable must be set (e.g., to `nano` or `vim`).
## Web-based inspector
Run the inspector in this directory (this is where your `*.traj` files are):
```bash
sweagent inspector
# or
sweagent I
```
The inspector will then be launched in the browser:
![trajectory inspector](../assets/inspector_1.png){: style="width: 49%;"}
![trajectory inspector](../assets/inspector_2.png){: style="width: 49%;"}
**Additional flags**
- `--directory`: Directory of trajectories to inspect (Defaults to current directory)
- `--port`: Port to host web app (Defaults to `8000`).
## Benchmark results
If you are running SWE-agent on a benchmark (see [batch mode](batch_mode.md)), you will see evaluation results as ✅ or ❌.
Otherwise, you will see ❓.
!!! tip
    * If you do not see evaluation results, make sure that the SWE-bench output
      is called `results.json` and is in the same directory as the trajectories.
    * To see gold patches, point `--data_path` to the SWE-bench dataset.
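
If evaluation results do not show up, a quick check of the experiment directory can help. The following is a minimal sketch, not part of SWE-agent: the helper name `summarize_experiment_dir` is hypothetical, and searching subdirectories for `*.traj` files is an assumption about the folder layout. It only reports which trajectory files exist and whether a `results.json` sits at the top of the directory.

```python
from pathlib import Path


def summarize_experiment_dir(directory: str = ".") -> dict:
    """Report the .traj files under `directory` and whether results.json exists.

    Hypothetical helper for illustration; it does not call any SWE-agent API.
    """
    root = Path(directory)
    # Assumption: trajectories may live in per-instance subfolders, so search recursively.
    traj_files = sorted(str(p.relative_to(root)) for p in root.rglob("*.traj"))
    return {
        "traj_files": traj_files,
        "has_results_json": (root / "results.json").is_file(),
    }


if __name__ == "__main__":
    summary = summarize_experiment_dir(".")
    print(f"Found {len(summary['traj_files'])} trajectory file(s)")
    if not summary["has_results_json"]:
        print("No results.json in this directory; the inspector will have no evaluation results to show.")
```

Run it from the experiment directory before starting the inspector; it only reads the file system and changes nothing.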
{% include-markdown "../_footer.md" %}

View File

@@ -0,0 +1,28 @@
Given an unsorted integer array nums, return the smallest positive integer that is not present in nums.
You must implement an algorithm that runs in O(n) time and uses O(1) auxiliary space.
## Example 1:
> Input: nums = [1,2,0]
> Output: 3
> Explanation: The numbers in the range [1,2] are all in the array.
## Example 2:
> Input: nums = [3,4,-1,1]
> Output: 2
> Explanation: 1 is in the array but 2 is missing.
## Example 3:
> Input: nums = [7,8,9,11,12]
> Output: 1
> Explanation: The smallest positive integer 1 is missing.
## Constraints:
1 <= nums.length <= 10^5
-2^31 <= nums[i] <= 2^31 - 1
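
One common way to meet the O(n) time and O(1) auxiliary space requirement is to use the input array itself as a lookup table: each value `v` in the range `1..n` is swapped into index `v - 1`, and the first index whose value does not match gives the answer. The sketch below is one possible approach, not a reference solution from this repository; the function name `first_missing_positive` is chosen for illustration, and the function modifies `nums` in place.

```python
def first_missing_positive(nums: list[int]) -> int:
    """Return the smallest positive integer missing from nums (reorders nums in place)."""
    n = len(nums)
    for i in range(n):
        # Move nums[i] to its "home" index nums[i] - 1 until the current slot
        # holds an out-of-range value or a value that is already at home.
        while 1 <= nums[i] <= n and nums[nums[i] - 1] != nums[i]:
            target = nums[i] - 1  # fix the index before swapping
            nums[i], nums[target] = nums[target], nums[i]
    # The first slot that does not hold index + 1 reveals the missing value.
    for i in range(n):
        if nums[i] != i + 1:
            return i + 1
    # All of 1..n are present.
    return n + 1
```

Each swap places at least one value in its final position, so the total number of swaps is bounded by n even though the inner loop is a `while`.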

View File

@@ -0,0 +1,140 @@
#!/usr/bin/env python3
"""
Memory Sentinel Script
This script monitors the system's RAM usage and kills the process with the highest
memory consumption if available RAM drops below 5GB.
Usage:
python memory_sentinel.py
The script runs continuously, checking memory usage every second.
Script was generated by Claude 3.7 with the following prompt:
I'm working on a server and I have one script that sometimes consumes so much memory that the entire server becomes unresponsive. This is a huge problem because then I cannot log in to it anymore.
Could you write a sentinel script in python that does the following:
Check the total available RAM of the system (disregarding swap)
Check the currently used RAM of the system
If we have more than 5G left, do nothing
Else, find the process with the highest RAM consumption and kill it. Note: Use the SIGKILL command
Check every second
"""
import logging
import os
import signal
import time
import psutil
# Set up logging
logging.basicConfig(
level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s", filename="memory_sentinel.log", filemode="a"
)
# Memory threshold in GB
MEMORY_THRESHOLD_GB = 5
# Convert to bytes for comparison
MEMORY_THRESHOLD_BYTES = MEMORY_THRESHOLD_GB * 1024 * 1024 * 1024
def get_available_ram():
"""Get the available RAM in bytes, excluding swap."""
return psutil.virtual_memory().available
def get_total_ram():
"""Get the total RAM in bytes, excluding swap."""
return psutil.virtual_memory().total
def get_used_ram():
"""Get the used RAM in bytes."""
return psutil.virtual_memory().used


def get_process_with_highest_memory():
    """Find the process with the highest memory consumption."""
    processes = []
    for proc in psutil.process_iter(["pid", "name", "memory_info"]):
        try:
            mem_info = proc.info["memory_info"]
            # process_iter() stores None for attributes it was not allowed to read
            if mem_info is None:
                continue
            processes.append((proc.info["pid"], proc.info["name"], mem_info.rss))
        except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
            pass
    # Sort by memory usage (descending)
    processes.sort(key=lambda x: x[2], reverse=True)
    if processes:
        return processes[0]
    return None


def kill_process(pid):
    """Kill a process using SIGKILL."""
    try:
        os.kill(pid, signal.SIGKILL)
        return True
    except OSError as e:
        logging.error(f"Failed to kill process {pid}: {e}")
        return False


def format_bytes(bytes_value):
    """Format bytes to a human-readable string."""
    for unit in ["B", "KB", "MB", "GB", "TB"]:
        if bytes_value < 1024.0:
            return f"{bytes_value:.2f} {unit}"
        bytes_value /= 1024.0
    return f"{bytes_value:.2f} PB"


def main():
    """Main function that runs the memory monitoring loop."""
    logging.info("Memory Sentinel started")
    logging.info(f"Memory threshold set to {MEMORY_THRESHOLD_GB}GB")
    try:
        while True:
            available_ram = get_available_ram()
            total_ram = get_total_ram()
            used_ram = get_used_ram()
            logging.debug(
                f"Total RAM: {format_bytes(total_ram)}, "
                + f"Used RAM: {format_bytes(used_ram)}, "
                + f"Available RAM: {format_bytes(available_ram)}"
            )
            if available_ram < MEMORY_THRESHOLD_BYTES:
                logging.warning(
                    f"Available RAM ({format_bytes(available_ram)}) " + f"below threshold of {MEMORY_THRESHOLD_GB}GB"
                )
                process = get_process_with_highest_memory()
                if process:
                    pid, name, memory = process
                    logging.warning(f"Killing process {pid} ({name}) " + f"using {format_bytes(memory)}")
                    if kill_process(pid):
                        logging.info(f"Successfully killed process {pid} ({name})")
                    else:
                        logging.error(f"Failed to kill process {pid} ({name})")
                else:
                    logging.warning("No process found to kill")
            # Sleep for 1 second
            time.sleep(1)
    except KeyboardInterrupt:
        logging.info("Memory Sentinel stopped by user")
    except Exception as e:
        logging.error(f"Unexpected error: {e}")
        raise


if __name__ == "__main__":
    main()

View File

@@ -0,0 +1,157 @@
# Multimodal Support
SWE-agent supports multimodal AI models that can process both text and images. This enables the agent to work with visual context from GitHub issues, such as screenshots, diagrams, and UI mockups.
## Overview
The multimodal implementation automatically:
- **Detects multimodal instances** from SWE-bench Multimodal datasets
- **Downloads images** from GitHub issue URLs
- **Converts to base64 markdown** format for AI model consumption
- **Handles errors gracefully** with fallback to text-only processing
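
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not SWE-agent's actual implementation: the function names are hypothetical, and `download` is a caller-supplied callable that returns the raw image bytes for a URL.

```python
import base64


def image_to_base64_markdown(image_bytes: bytes, mime_type: str = "image/png", alt: str = "image") -> str:
    """Embed raw image bytes in a markdown image tag using a data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"![{alt}](data:{mime_type};base64,{encoded})"


def process_issue_image(download, url: str) -> str:
    """Return base64 markdown for `url`, or the plain URL if the download fails.

    `download` is a hypothetical callable: download(url) -> bytes.
    """
    try:
        return image_to_base64_markdown(download(url))
    except Exception:
        # Fall back to text-only processing: keep the original URL in the text
        return url
```

Passing the downloader in as a parameter keeps the sketch free of network code and makes the fallback path easy to exercise.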
## Supported Image Categories
Currently, SWE-agent processes images from the `problem_statement` category, which includes:
- Screenshots showing bugs or issues
- UI mockups and design specifications
- Diagrams explaining the problem
- Error screenshots and visual evidence
!!! note "Design Choice"
    Only `problem_statement` images are processed to provide essential visual context for understanding the task, while preserving agent autonomy in determining solution approaches. Images from `patch` and `test_patch` categories may contain solution hints and are not processed.
## Configuration
### Basic Multimodal Setup
Use the pre-configured multimodal setup:
```bash
sweagent run-batch \
--config config/default_mm_with_images.yaml \
--instances.type swe_bench \
--instances.subset multimodal \
--instances.split dev
```
### Disabling Image Processing
You can disable image processing globally:
```yaml
# config/your_config.yaml
agent:
  templates:
    disable_image_processing: true
```
Or for specific instances:
```python
from sweagent.agent.problem_statement import SWEBenchMultimodalProblemStatement

problem_statement = SWEBenchMultimodalProblemStatement(
    text="Fix the rendering issue",
    issue_images=["https://example.com/screenshot.png"],
    disable_image_processing=True,  # Skip image processing
)
```
## Supported Models
Multimodal support works with any vision-capable model, including:
- **Claude Sonnet 4**
- **o3** and **o4-mini**
- **Gemini 2.5** models
Example model configuration:
```yaml
# model_configs/claude-sonnet-4-20250514_mm.yaml
model:
  name: claude-sonnet-4-20250514
  # Vision capabilities automatically detected
```
## Image Processing Details
### Supported Formats
- PNG, JPEG, WebP images
- Maximum size: 10MB per image
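
A minimal sketch of how these limits could be checked before an image is encoded. The helper names are hypothetical and this is not taken from SWE-agent's code; it also assumes "10MB" means 10 × 1024 × 1024 bytes. The format is detected from the file's leading magic bytes rather than from its extension.

```python
from typing import Optional

MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: "10MB" is interpreted as 10 MiB


def sniff_image_format(data: bytes) -> Optional[str]:
    """Guess the image format from its magic bytes, or return None if unsupported."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    # WebP files are RIFF containers with "WEBP" at byte offset 8
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def is_supported_image(data: bytes) -> bool:
    """True if the image is within the size limit and in a supported format."""
    return len(data) <= MAX_IMAGE_BYTES and sniff_image_format(data) is not None
```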
## Example Usage
### Automatic Detection
When loading SWE-bench instances, multimodal support is automatic:
```json
{
  "instance_id": "example__repo-123",
  "problem_statement": "Fix the chart rendering bug...",
  "image_assets": {
    "problem_statement": ["http://example.com/chart.png"]
  }
}
```
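
As an illustration of the design choice described earlier, the sketch below pulls only the `problem_statement` image URLs out of an instance shaped like the JSON above and ignores every other category. The function name is hypothetical, and the sketch assumes `image_assets` is already a parsed dictionary.

```python
def problem_statement_images(instance: dict) -> list[str]:
    """Return the image URLs attached to the problem statement, if any."""
    assets = instance.get("image_assets") or {}
    # Only the problem_statement category is used; patch/test_patch images are ignored
    return list(assets.get("problem_statement", []))
```

An instance without an `image_assets` key yields an empty list, which corresponds to text-only processing.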
### Direct Usage
```python
from sweagent.agent.problem_statement import SWEBenchMultimodalProblemStatement

problem_statement = SWEBenchMultimodalProblemStatement(
    text="Fix the rendering issue shown in the screenshots",
    issue_images=[
        "https://example.com/before.png",
        "https://example.com/after.png",
    ],
)

# This downloads images and converts them to base64 markdown
processed_text = problem_statement.get_problem_statement()
```
## Configuration Options
To enable multimodal processing, update the following configuration options:
### History Processing
Enable image parsing in your configuration:
```yaml
agent:
  history_processors:
    - type: image_parsing  # Parse base64 encoded images in observations
```
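
To make the idea of parsing base64-encoded images in observations concrete, here is a rough sketch of what such a step could look like. It is an assumption-laden illustration, not the `image_parsing` processor itself: the regular expression, the function name, and the `[image]` placeholder are all hypothetical.

```python
import base64
import re

# Matches markdown images whose target is a base64 data URI,
# e.g. ![alt](data:image/png;base64,AAAA...)
DATA_URI_IMAGE = re.compile(r"!\[([^\]]*)\]\(data:(image/[\w.+-]+);base64,([A-Za-z0-9+/=]+)\)")


def split_text_and_images(observation: str):
    """Separate an observation into plain text and a list of decoded images."""
    images = [
        {"alt": m.group(1), "mime_type": m.group(2), "data": base64.b64decode(m.group(3))}
        for m in DATA_URI_IMAGE.finditer(observation)
    ]
    # Replace each embedded image with a short placeholder in the text
    text = DATA_URI_IMAGE.sub("[image]", observation)
    return text, images
```

Separating the two lets the text be sent as a normal message part while the images are passed to the model in whatever structured form its API expects.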
### Tool Bundles
Include image and browser tools for visual tasks:
```yaml
agent:
  tools:
    bundles:
      - path: tools/image_tools  # includes open_image tool to let models open image files
      - path: tools/web_browser  # includes 17 browser automation tools (click_mouse, open_site, etc.)
```
The `web_browser` bundle provides tools for:
- Opening websites (`open_site`)
- Taking screenshots (`screenshot_site`)
- Interacting with web pages (`click_mouse`, `type_text`, `scroll_on_page`)
- Executing JavaScript (`execute_script_on_page`)
- And more - see the [configuration guide](../config/config.md#web-browser-tools) for the full list
### Templates Configuration
Multimodal processing is enabled automatically when both `--instances.type=swe_bench` and `--instances.subset=multimodal` are set.
To disable this behavior, set `--agent.templates.disable_image_processing=true`.

View File

@@ -0,0 +1,103 @@
# Output files
The [`trajectories/`](https://github.com/SWE-agent/SWE-agent/tree/main/trajectories) folder is the default location where experiment results (for example, from invocations of [`sweagent run`](cl_tutorial.md)) are written.
## Trajectories
The main output file is `<instance_id>.traj`, a JSON-formatted file containing the (thought, action, observation) turns that SWE-agent generated while solving `<instance_id>`.
Under the `trajectory` key, you can see information for every step of the agent. It looks something like this:
```json
{
# This is the output of the LM
"response": "We are indeed seeing the same output as the issue. The issue suggests that we should look at line 1474 of the `fields.py`...",
# We then parse it into thoughts and actions
"thought": "We are indeed seeing the same output as the issue. The issue suggests that we should look at line 1474 of the `fields.py`...",
"action": "ls -F\n",
# And execute the action, resulting in the output
"observation": "AUTHORS.rst\nCHANGELOG.rst\nCODE_OF_CONDUCT.md...",
# In addition, after the action was executed, state can be extracted from the environment
"state": "{\"open_file\": \"/marshmallow-code__marshmallow/reproduce.py\", \"working_dir\": \"/marshmallow-code__marshmallow\"}\n",
# For debugging, we also keep all messages that were shown to the LM
"query": [{"role": "system", "content": "You are a helpful assistant ..."}, ...]
},
```
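
The step structure above can be inspected with a few lines of Python. This is a minimal sketch with a hypothetical helper name; it relies only on the `trajectory`, `action`, and `state` keys shown above and assumes `state` may be stored as a JSON-encoded string, as in the example.

```python
import json


def summarize_steps(traj: dict) -> list[dict]:
    """Return one small summary dict per step of a loaded trajectory."""
    summary = []
    for index, step in enumerate(traj.get("trajectory", [])):
        state = step.get("state") or {}
        # In the example above, "state" is itself a JSON-encoded string
        if isinstance(state, str):
            state = json.loads(state)
        summary.append(
            {
                "step": index,
                "action": step.get("action", "").strip(),
                "working_dir": state.get("working_dir"),
            }
        )
    return summary
```

To use it, load a `.traj` file with `json.load` and pass the resulting dictionary to `summarize_steps`.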
!!! warning "Query and message field"
    Prior to SWE-agent 1.1.0, we had a `message` field which corresponded (approximately) to the input
    for the LM for the _next_ step. This was replaced with `query`, which shows the exact input
    at the current step.
Here's a full example:
<details>
<summary>Example trajectory</summary>
Note: This trajectory is from SWE-agent v0.7.0. It might look different in newer versions.
```json
--8<-- "trajectories/demonstrations/replay__marshmallow-code__marshmallow-1867__default__t-0.20__p-0.95__c-2.00__install-1___install_from_source/marshmallow-code__marshmallow-1867.traj"
```
</details>
!!! tip "Viewing trajectories"
    We provide a [trajectory viewer](inspector.md) for easy viewing of trajectories.
    If you want to view the JSON, we recommend [jsoneditoronline](https://jsoneditoronline.org/),
    because of the way it handles long lines.
Trajectories can also be turned into custom demonstrations for SWE-agent ([more information](../config/demonstrations.md)).
## Other output files
Together with the trajectory, SWE-agent will also generate the following files per instance:
* `config.yaml`: A yaml file containing the configuration used for the experiment. You can directly repeat the run of this
exact instance by running `sweagent run --config config.yaml`.
* `*.log`: Log files at different levels
At a high level, the experiments folder is organized in the following manner:
```
trajectories:
  user 1:
    experiment 1:
      run_batch.config.yaml
      preds.json
      run_batch.debug.log
      run_batch.info.log
      run_batch.warning.log
      run_batch_exit_statuses.yaml
      instance_1:
        instance_1.traj
        instance_1.config.yaml
        instance_1.debug.log
        instance_1.info.log
        instance_1.warning.log
      instance 2:
        ...
    experiment 2:
      ...
  user 2:
    ...
```
Every experiment follows the pattern `trajectories/<user name>/<experiment name>`. The `<user name>` is automatically inferred from your system, and the `<experiment name>` is inferred from the model names and other settings passed to the run command.
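
Given this layout, trajectory files can be collected with a glob. The sketch below uses a hypothetical helper name and assumes exactly the nesting shown above, with one sub-folder per instance inside each experiment folder.

```python
from pathlib import Path


def find_trajectories(root, user=None, experiment=None):
    """List .traj files under root/<user name>/<experiment name>/<instance>/.

    Leaving `user` or `experiment` as None matches every folder at that level.
    """
    pattern = "/".join([user or "*", experiment or "*", "*", "*.traj"])
    return sorted(Path(root).glob(pattern))
```

For example, `find_trajectories("trajectories", user="alice")` would list the trajectories of all experiments run by a user named `alice`.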
Here are a few more files that are generated with `sweagent run-batch`:
* `run_batch.config.yaml`: A yaml file containing the configuration used for the experiment. You can directly repeat the run of this
exact experiment by running `sweagent run-batch --config run_batch.config.yaml`.
* `preds.json`: All predictions (i.e., patches produced by the agent) from all instances of the experiment.
If this file goes out of date, you can regenerate it by running `sweagent merge-preds`.
* `run_batch.*.log`: Log files at different levels.
* `run_batch_exit_statuses.yaml`: A yaml file containing the exit statuses of all instances in the experiment (for this run of `sweagent run-batch`).
!!! tip
    * `sweagent run-batch` does not run the evaluation; that is a separate step (see [batch mode](batch_mode.md)).
    * `preds.json` can be passed directly to SWE-bench to run the evaluation.
    * Trajectories can be turned into custom demonstrations for SWE-agent ([more information](../config/demonstrations.md)).
{% include-markdown "../_footer.md" %}

View File

@@ -0,0 +1,40 @@
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Validation error │
│ │
│ The following errors are raised by Pydantic, trying to instantiate the configuration based on │
│ the merged configuration dictionary (see above). │
│ │
│ Every new indented block corresponds to a different error from Pydantic. │
│ The first line of each block is the attribute that failed validation, the following lines are the error messages. │
│ │
│ If you see many lines of errors, there are probably different ways to instantiate the same object (a union type). │
│ For example, there are different deployments with different options each. Pydantic is then trying │
│ one after the other and reporting the failures for each of them. │
│ │
│ 8 validation errors for RunSingleConfig │
│ agent.model │
│ Field required }, input_type=dict] │
│ For further information visit https://errors.pydantic.dev/2.9/v/missing │
│ problem_statement.TextProblemStatement.text │
│ Field required │
│ For further information visit https://errors.pydantic.dev/2.9/v/missing │
│ problem_statement.TextProblemStatement.path │
│ Extra inputs are not permitted │
│ For further information visit https://errors.pydantic.dev/2.9/v/extra_forbidden │
│ problem_statement.TextProblemStatement.github_url │
│ Extra inputs are not permitted │
│ For further information visit https://errors.pydantic.dev/2.9/v/extra_forbidden │
│ problem_statement.GithubIssue.path │
│ Extra inputs are not permitted │
│ For further information visit https://errors.pydantic.dev/2.9/v/extra_forbidden │
│ problem_statement.EmptyProblemStatement.path │
│ Extra inputs are not permitted │
│ For further information visit https://errors.pydantic.dev/2.9/v/extra_forbidden │
│ problem_statement.EmptyProblemStatement.github_url │
│ Extra inputs are not permitted │
│ For further information visit https://errors.pydantic.dev/2.9/v/extra_forbidden │
│ problem_statement.FileProblemStatement.github_url │
│ Extra inputs are not permitted │
│ For further information visit https://errors.pydantic.dev/2.9/v/extra_forbidden │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

View File

@@ -0,0 +1,15 @@
# What's next?
Congratulations, this is currently the last tutorial!
There are two more sections on this website to check out next:
* [User guides](index.md): These cover selected topics in more detail, for example:
    * What are all the different [subcommands](cli.md) of `sweagent`?
    * How to [configure SWE-agent](../config/index.md)?
    * What's the main [architecture](../background/architecture.md) of SWE-agent?
* [API reference](../reference/index.md) for complete lists of all configuration options.