Releases
v1.0.0
KubeRay is officially in General Availability!
Bump the CRD version from v1alpha1 to v1.
Relocate almost all documentation to the Ray website.
Improve RayJob UX.
Improve GCS fault tolerance.
GCS fault tolerance
CRD versioning
RayService
[Hotfix][Bug] Avoid unnecessary zero-downtime upgrade (#1581, @kevin85421)
[Feature] Add an example for RayService high availability (#1566, @kevin85421)
[Feature] Add a flag to make zero downtime upgrades optional (#1564, @kevin85421)
[Bug][RayService] KubeRay does not recreate Serve applications if a head Pod without GCS FT recovers from a failure. (#1420, @kevin85421)
[Bug] Fix the filename of text summarizer YAML (#1415, @kevin85421)
[serve] Change text ml yaml to use French in user config (#1403, @zcin)
[services] Add text ml rayservice yaml (#1402, @zcin)
[Bug] Fix flakiness of RayService e2e tests (#1385, @kevin85421)
Add RayService sample test (#1377, @Darren221)
[RayService] Revisit the conditions under which a RayService is considered unhealthy and the default threshold (#1293, @kevin85421)
[RayService][Observability] Add more logging about networking issues (#1282, @kevin85421)
RayJob
[Feature] Improve observability for flaky RayJob test (#1587, @kevin85421)
[Bug][RayJob] Fix FailedToGetJobStatus by allowing transition to Running (#1583, @architkulkarni)
[RayJob] Fix RayJob status reconciliation (#1539, @astefanutti)
[RayJob]: Always use target RayCluster image as default RayJob submitter image (#1548, @astefanutti)
[RayJob] Add default CPU and memory for job submitter pod (#1319, @architkulkarni)
[Bug][RayJob] Check dashboard readiness before creating job pod (#1381) (#1429, @rueian)
[Feature][RayJob] Use RayContainerIndex instead of 0 (#1397) (#1427, @rueian)
[RayJob] Enable job log streaming by setting PYTHONUNBUFFERED in job container (#1375, @architkulkarni)
Add field to expose entrypoint num cpus in rayjob (#1359, @shubhscoder)
[RayJob] Add runtime env YAML field (#1338, @architkulkarni)
[Bug][RayJob] RayJob with custom head service name (#1332, @kevin85421)
[RayJob] Add e2e sample yaml test for shutdownAfterJobFinishes (#1269, @architkulkarni)
RayCluster
[Enhancement] Remove unused variables in constant.go (#1474, @evalaiyc98)
[Enhancement] GPU RayCluster doesn't work on GKE Autopilot (#1470, @kevin85421)
[Refactor] Parameterize TestGetAndCheckServeStatus (#1450, @evalaiyc98)
[Feature] Make replicas optional for WorkerGroupSpec (#1443, @kevin85421)
use raycluster app's name as podgroup name key word (#1446, @lowang-bh)
[Refactor] Make port name variables consistent and meaningful (#1389, @evalaiyc98)
[Feature] Use image of Ray head container as the default Ray Autoscaler container (#1401, @kevin85421)
Update Autoscaler YAML for the Autoscaler tutorial (#1400, @kevin85421)
[Feature] Ray container must be the first application container (#1379, @kevin85421)
[release blocker][Feature] Only Autoscaler can make decisions to delete Pods (#1253, @kevin85421)
[release blocker][Autoscaler] Randomly delete Pods when scaling down the cluster (#1251, @kevin85421)
Helm charts
KubeRay API Server
Added Python API server client (#1561, @blublinsky)
updating url use v1 (#1577, @blublinsky)
Fixed processing of job submitter (#1562, @blublinsky)
extended job APIs (#1537, @blublinsky)
fixed volumes test in cluster test (#1498, @blublinsky)
Add documentation for API Server monitoring (#1479, @blublinsky)
created HA example for API server (#1461, @blublinsky)
Numerous fixes to the API server to make RayJob APIs working (#1447, @blublinsky)
Updated API server documentation (#1435, @z103cb)
servev2 support for API server (#1419, @blublinsky)
replacement for #1312 (#1409, @blublinsky)
Updates to the apiserver swagger-ui (#1410, @z103cb)
implemented liveness/readiness probe for the API server (#1369, @blublinsky)
Operator support for OpenShift (#1371, @blublinsky)
Removed use of BUILD_FLAGS in apiserver makefile (#1336, @z103cb)
Api server makefile (#1301, @z103cb)
Documentation
[Doc] Update release docs (#1621, @kevin85421)
[Doc] Fix release doc format (#1578, @kevin85421)
Update kuberay mcad integration doc (#1373, @tedhtchang)
[Release][Doc] Add instructions to release Go modules. (#1546, @kevin85421)
[Post v1.0.0-rc.1] Reenable sample YAML tests for latest release and update some docs (#1544, @kevin85421)
Update operator development instruction (#1458, @tedhtchang)
doc: fix moved link (#1462, @hongchaodeng)
Fix mkDocs (#1448, @kevin85421)
Update Kuberay doc to version 1.0.0 rc.0 (#1441, @Yicheng-Lu-llll)
[Doc] Delete unused docs (#1440, @kevin85421)
[Post Ray 2.7.0 Release] Update Ray versions to Ray 2.7.0 (#1423, @GeneDer)
[Doc] Update README (#1433, @kevin85421)
[release] Redirect users to Ray website (#1431, @kevin85421)
[Docs] Update Security Guidance on Dashboard Ingress (#1413, @ijrsvt)
Update Volcano integration doc (#1380, @annajung)
[Doc] Add gke bucket yaml (#1372, @architkulkarni)
[RayJob] [Doc] Add real-world Ray Job use case tutorial for KubeRay (#1361, @architkulkarni)
Delete ray_v1alpha1_rayjob.batch-inference.yaml (#1360, @architkulkarni)
Documentation and example for running simple NLP service on kuberay (#1340, @gvspraveen)
Add a document for profiling (#1299, @Yicheng-Lu-llll)
Fix: Typo (#1295, @ArgonQQ)
[Post release v0.6.0] Update CHANGELOG.md (#1274, @kevin85421)
Release v0.6.0 doc validation (#1271, @kevin85421)
[Doc] Develop Ray Serve Python script on KubeRay (#1250, @kevin85421)
[Doc] Fix the order of comments in sample Job YAML file (#1242, @architkulkarni)
[Doc] Upload a screenshot for the Serve page in Ray dashboard (#1236, @kevin85421)
Fix typo (#1241, @mmourafiq)
CI
[Bug] Fix flaky sample YAML tests (#1590, @kevin85421)
Allow to install and remove operator via scripts (#1545, @jiripetrlik)
[CI] Create release tag for ray-operator Go module (#1574, @astefanutti)
[Test][Bug] Update worker replicas idempotently in rayjob autoscaler envtest (#1471) (#1543, @rueian)
Update Dockerfiles to address CVE-2023-44487 (HTTP/2 Rapid Reset) (#1540, @astefanutti)
[CI] Skip redis raycluster sample YAML test (#1465, @architkulkarni)
Revert "[CI] Skip redis raycluster sample YAML test" (#1490, @rueian)
Remove GOARCH in ray-operator/Dockerfile to support multi-arch images (#1442, @ideal)
Update Dockerfile to address closed CVEs (#1488, @anishasthana)
[CI] Update latest release to v1.0.0-rc.0 in tests (#1467, @architkulkarni)
[CI] Reenable rayjob sample yaml latest test (#1464, @architkulkarni)
Updating logrus and net packages in go.mod (#1495, @jbusche)
Allow E2E tests to run with arbitrary k8s cluster (#1306, @jiripetrlik)
Bump golang.org/x/net from 0.0.0-20210405180319-a5a99cb37ef4 to 0.7.0 in /proto (#1345, @dependabot[bot])
Bump golang.org/x/text from 0.3.5 to 0.3.8 in /proto (#1344, @dependabot[bot])
Bump go.mongodb.org/mongo-driver from 1.3.4 to 1.5.1 in /apiserver (#1407, @dependabot[bot])
Bump golang.org/x/sys from 0.0.0-20210510120138-977fb7262007 to 0.1.0 in /proto (#1346, @dependabot[bot])
Bump golang.org/x/net from 0.0.0-20210813160813-60bc85c4be6d to 0.7.0 in /cli (#1405, @dependabot[bot])
Bump github.com/emicklei/go-restful from 2.9.5+incompatible to 2.16.0+incompatible in /ray-operator (#1348, @dependabot[bot])
Bump golang.org/x/sys from 0.0.0-20211210111614-af8b64212486 to 0.1.0 in /cli (#1347, @dependabot[bot])
[CI] Remove RayService tests from compatibility-test.py (#1395, @kevin85421)
[CI] Remove extraPortMappings from kind configurations (#1366, @kevin85421)
[CI] Update latest ray version 2.5.0 -> 2.6.3 (#1320, @architkulkarni)
Bump the golangci-lint version in the api server makefile (#1342, @z103cb)
[CI] Refactor pipeline and test RayCluster sample yamls (#1321, @architkulkarni)
Update doc and base image for Go 1.19 (#1330, @tedhtchang)
Fix release actions (#1323, @anishasthana)
Upgrade to Go 1.19 (#1325, @kevin85421)
[CI] Run sample job YAML tests in buildkite (#1315, @architkulkarni)
[CI] Downgrade kind from v0.20.0 to v0.11.1 (#1313, @architkulkarni)
[CI] Publish KubeRay operator / apiserver images to Quay (#1307, @kevin85421)
[CI] Install kuberay operator in buildkite test (#1308, @architkulkarni)
[CI] Verify kubectl in kind-in-docker step (#1305, @architkulkarni)
[Quay] Sanity check for KubeRay repository setup (#1300, @kevin85421)
[CI] Only run test_ray_serve for Ray 2.6.0 and later (#1288, @kevin85421)
Update ray operator Dockerfile (#1213, @anishasthana)
[Golang] Remove go get (#1283, @ijrsvt)
Dependencies: Upgrade golang.org/x packages (#1281, @ijrsvt)
[CI] Add kind-in-Docker test to Buildkite CI (#1243, @architkulkarni)
Others