feat: Adding RestrictedFeatures Support to the Python Frontend Bindings #7775

Open. Wants to merge 3,533 commits into base: main.

Commits (3,533):
aff4b93
HTTP live connections on server shutdown (#6986)
kthui Apr 9, 2024
10f1c8d
Enable autodocs for python client library API documentation (#7082)
tanmayv25 Apr 9, 2024
5e20ef6
Updated vllm version (#7095)
oandreeva-nv Apr 10, 2024
52f97b5
Disable Dynamic Log File (#7092)
yinggeh Apr 11, 2024
159b060
Validate system shared memory region size when registering a region (…
rmccorm4 Apr 11, 2024
196caf0
Decoupled Async Execute (#7062)
kthui Apr 11, 2024
5b739db
Add trace mode and trace config entries in trace settings API (#7050)
indrajit96 Apr 11, 2024
0a4c87b
Update 'main' to track development of 2.46.0 / 24.05 (#7105)
mc-nv Apr 11, 2024
3b6c6f9
Validate the memory requested for the infer request is not out of bou…
jbkyang-nvi Apr 12, 2024
b889687
Add copyright for tritonclient_api (#7109)
Tabrizian Apr 12, 2024
7529f0e
Disable dynamic trace file (#7106)
yinggeh Apr 13, 2024
e116a2a
Update L0_logging to reflect error when trying to update log_file (#7…
yinggeh Apr 13, 2024
8e88f2c
Add new cached channel test (#7123)
jbkyang-nvi Apr 17, 2024
e965287
Fix gRPC frontend race condition (#7110)
kthui Apr 17, 2024
233c4b2
Remove client testing of server trace to match discontinued support f…
matthewkotila Apr 17, 2024
2de09ee
Re-enable PA trace testing but remove setting trace file (#7131)
matthewkotila Apr 19, 2024
dba31c2
Fix windows build for shared memory bound checking(#7137)
jbkyang-nvi Apr 19, 2024
09b34be
Fix test for cached channels (#7130)
jbkyang-nvi Apr 19, 2024
1da454c
Use a lower concurrency with more repetition for L0_memory_growth (#7…
krishung5 Apr 23, 2024
f243276
Replace deprecated tritongrpcclient package (#7061)
Tabrizian Apr 24, 2024
365b86a
Avoid the HTTP Error 403: rate limit exceeded error (#7155)
krishung5 Apr 25, 2024
987deaa
Clarify instance group documentation for ensemble (#7162)
Tabrizian Apr 25, 2024
d432266
Add extra footer to documentation (#7163)
mc-nv Apr 26, 2024
5239ff0
Add metrics model namespacing label test (#7141)
kthui Apr 26, 2024
16e5470
Update `main` post-24.04 (#7160)
mc-nv Apr 30, 2024
3c99c95
Remove meetup note now that the event has completed (#7179)
Tabrizian May 3, 2024
a9d3dac
Validate CUDA SHM region registration size (#7178)
krishung5 May 7, 2024
ee6d238
Fix python client Shm Leak (#7172)
fpetrini15 May 7, 2024
c724193
Add test for sequence state after cancellation (#7167)
kthui May 7, 2024
27c2142
Rename triton_tensorrtllm_worker -> trtllmExecutorWorker (#7194)
krishung5 May 8, 2024
884ca4e
Tests for Top Level Request Caching for Ensemble Models (#7074)
lkomali May 9, 2024
6694b74
Test cuda shared memory offset and byte size out of bounds(#7202)
jbkyang-nvi May 10, 2024
dd71d3b
Upgrade the golang version to 1.22.3 (#7208)
tanmayv25 May 13, 2024
a669145
Update 'Dockerfile' Python path to include DALI (#7216)
mc-nv May 14, 2024
4dcda7f
Remove the dependency on CUDA driver (#7224)
krishung5 May 15, 2024
d6fe6e6
Multiple Model Configurations (#7185)
yinggeh May 16, 2024
d356d6e
Fix L0_backend_python iGPU PyTorch installation (#7231)
kthui May 16, 2024
747f5d4
Fix the L0_simple_go_client (#7239)
tanmayv25 May 17, 2024
0370485
Add section on ensemble model caching (#7234)
rmccorm4 May 18, 2024
3e97828
Add testing for escaped log messages
nnshah1 May 20, 2024
9faf444
updating log parsing in test
nnshah1 May 21, 2024
620f095
Add documentation on logging formats
nnshah1 May 21, 2024
0c4228c
Return an error if --load-model is specified without explicit model c…
rmccorm4 May 22, 2024
1322225
Exclude Jax example from Python 3.8 (#7260)
krishung5 May 23, 2024
2d2c0b5
add test for shape validation (#7195)
jbkyang-nvi May 24, 2024
9cfc53a
Enhance OTEL testing to capture and verify Cancellation Requests and …
indrajit96 May 24, 2024
60a06bf
Fix Python 3.11 env (#7274)
krishung5 May 28, 2024
729b677
Bump vllm to v0.4.2 (#7198)
kebe7jun May 29, 2024
ea095c9
Update main to track development for 2.47.0 / r24.06 (#7291)
tanmayv25 May 29, 2024
20f3487
Update 'main' post 24.05 release (#7298)
tanmayv25 May 29, 2024
c907231
Update openvino to 2024.0.0 (#7299)
krishung5 May 30, 2024
c3eb5ca
docs: Update PR templates (#7290)
jbkyang-nvi May 30, 2024
13f819b
docs: Add default template that diverts to sub templates (#7306)
jbkyang-nvi May 30, 2024
d189a87
Added new flag for GPU peer access API control (#7261)
indrajit96 Jun 3, 2024
4d113dc
build: Update vllm version to v0.4.3 (latest) (#7309)
oandreeva-nv Jun 3, 2024
6a303f8
fix: Fix L0_input_validation--base (#7304)
yinggeh Jun 4, 2024
34390d7
fix: Remove onnxruntime libraries from system path (#7323)
tanmayv25 Jun 5, 2024
b0ea306
Change TensorRT-LLM (#7143)
mc-nv Jun 5, 2024
b6734dd
Add testing for libtorch cudnn (#7286)
Tabrizian Jun 5, 2024
31f00b6
Fix gRPC streaming non-decoupled segfault if sending response and fin…
kthui Jun 6, 2024
497475e
Add support for response sender in the default mode (#7311)
kthui Jun 6, 2024
8ce3890
fix: Handling grpc cancellation edge-case:: Cancelling at step START …
oandreeva-nv Jun 6, 2024
8745160
test: Add testing for CUDA EP options (#7328)
krishung5 Jun 6, 2024
03ca720
ci: Support BF16 data type in TensorRT backend (#7310)
pskiran1 Jun 7, 2024
c0e4c81
test: Update error messages to comply with core change (#7326)
yinggeh Jun 7, 2024
7236796
ci: Restrict numpy to version 1.x (#7327)
KrishnanPrash Jun 7, 2024
3135eb5
test: Fix the test to expect updated error messages (#7340)
tanmayv25 Jun 12, 2024
fe63eba
test: Python models filtering outputs based on requested outputs (#7338)
kthui Jun 12, 2024
5f8497f
test: Add test for sequence flags in ensemble streaming inference (#7…
indrajit96 Jun 12, 2024
fd1d9c4
fix: Fix version for setuptools and grpcio-tools. Remove cudnn 8 inst…
krishung5 Jun 18, 2024
f326993
ci: Add INT64 Datatype Support for Shape Tensors in TensorRT Backend …
pskiran1 Jun 20, 2024
9e55dab
Update 15-container-copyright.txt (#7375)
Tabrizian Jun 26, 2024
0f4c9d3
Update `main` post -24.06 (#7380)
mc-nv Jun 28, 2024
686cf1a
test: Add input byte size tests using C APIs (#7372)
yinggeh Jul 3, 2024
33d7e7e
[refactor]: Refactor Frontend Trace OpenTelemetry Implementation (#7390)
oandreeva-nv Jul 5, 2024
65a9140
[fix]: grpc state cleanup fix (#7409)
oandreeva-nv Jul 5, 2024
4415430
[build]: vllm version update (#7405)
oandreeva-nv Jul 5, 2024
8c5b94c
[feat]:Custom Backend Tracing (#7403)
oandreeva-nv Jul 5, 2024
66e4fff
build: Reduce intermediate layers (#7408)
krishung5 Jul 8, 2024
e9b811c
test: Remove AWS bucket on test failure (#7342)
kthui Jul 8, 2024
dabb7cb
fix: Fix error message for L0_trt_compat (#7432)
krishung5 Jul 10, 2024
2f299d1
feat: Support for request id field in generate API (#7392)
shreyas-samsung Jul 10, 2024
22d9261
perf: Improve response throughput of a single gRPC stream (#7404)
kthui Jul 12, 2024
b263bfc
test: Tests for Metrics API enhancement to include error counters (#7…
indrajit96 Jul 12, 2024
3421429
Update NGC versions post-24.07 release (#7469)
pvijayakrish Jul 25, 2024
96ef8a7
[build]: Bumping vllm version to v0.5.3.post1 (#7453)
oandreeva-nv Jul 25, 2024
f151f8a
ci: Fix shape and reformat free tensor handling in the input byte siz…
pskiran1 Jul 27, 2024
b8a3629
chore: PA Migration From Client (#7449)
fpetrini15 Jul 29, 2024
5e61a01
test: Refactor cpu metrics tests to make L0_metrics more stable (#7476)
rmccorm4 Jul 29, 2024
e713208
test: Add BF16 test for python backend (#7483)
rmccorm4 Jul 30, 2024
3443dd6
test: Improve L0_logging stability (#7486)
rmccorm4 Jul 31, 2024
839faf7
ci: Return custom exit code to indicate known shm leak failure in L0_…
krishung5 Jul 31, 2024
d4b585d
Including 'tritonserver.lib' into final package (#7491)
mc-nv Aug 2, 2024
327ee02
build: Add default value for argument 'TRITON_REPO_ORGANIZATION' from…
zhanga5 Aug 5, 2024
5b33a25
chore:Purge PA from Client Repo (#7488)
fpetrini15 Aug 6, 2024
04e0d85
PA Migration: Update L0_client_build_variants (#7505)
fpetrini15 Aug 7, 2024
3c7263f
test: Add test for sending response after sending complete final flag…
kthui Aug 7, 2024
ea3ebca
Add vLLM x Triton user meetup announcement (#7509)
harryskim Aug 8, 2024
a5ad309
Fix benchmarking tests (#7461)
pskiran1 Aug 10, 2024
61466d4
feat: Add vLLM counter metrics access through Triton (#7493)
yinggeh Aug 16, 2024
cadd112
build: RHEL 8 Compatibility (#7519)
nv-kmcgill53 Aug 16, 2024
5611ca1
feat: Add GRPC error codes to GRPC streaming if enabled by user. (#7499)
indrajit96 Aug 16, 2024
6857dc3
test: Add python backend tests for the new histogram metric (#7540)
yinggeh Aug 17, 2024
c91d1e5
test: Load new model version should not reload loaded existing model …
kthui Aug 20, 2024
a7a43a2
Intermittent `L0_decoupled_grpc_error` crash fixed. (#7552)
indrajit96 Aug 20, 2024
3735d99
ci: Raise Documentation Generation Errors (#7559)
fpetrini15 Aug 22, 2024
8e56e30
docs: Add tensorrtllm_backend into doc generation (#7563)
krishung5 Aug 23, 2024
be1a0a5
build: RHEL8 EA2 Backends (#7568)
fpetrini15 Aug 27, 2024
ef6afcd
Release: Update NGC versions post-24.08 release (#7565)
pvijayakrish Aug 27, 2024
c88aec5
docs: Add python backend to windows build command (#7572)
krishung5 Aug 27, 2024
3ea493f
docs: Triton TRT-LLM user guide (#7529)
krishung5 Aug 27, 2024
01438d8
Build: Updating to allow passing DOCKER_GPU_ARGS at model generation …
pvijayakrish Aug 27, 2024
5104900
feat: Python Deployment of Triton Inference Server (#7501)
KrishnanPrash Aug 30, 2024
89a9038
fix: Adding copyright info (#7591)
KrishnanPrash Sep 3, 2024
cb1204d
test: Refactor core input size checks (#7592)
yinggeh Sep 4, 2024
8da14cc
Don't Build `tritonfrontend` for Windows. (#7599)
fpetrini15 Sep 7, 2024
9076d2c
fix: Add reference count tracking for shared memory regions (#7567)
pskiran1 Sep 11, 2024
3eab666
build/test: RHEL8 EA3 (#7595)
fpetrini15 Sep 11, 2024
e452b58
Fix: Add mutex lock for state completion check in gRPC streaming to p…
pskiran1 Sep 17, 2024
a93de16
Update fetch_models.sh (#7621)
vd-nv Sep 19, 2024
b4525aa
ci: Set stability factor to a higher value (#7634)
lkomali Sep 20, 2024
e44cf29
[docs] Removed vLLM meetup announcement (#7673)
oandreeva-nv Oct 1, 2024
fe0e41e
Update the versions post 24.09 release.
pvijayakrish Sep 25, 2024
c2fa60c
Build: Update triton version in Map (#7610)
pvijayakrish Sep 11, 2024
26a05ed
Update versions post 24.09
fpetrini15 Sep 7, 2024
b0adf31
Dockerfile.win10.min - Update dependency versions (#7633)
mc-nv Sep 24, 2024
86dbef3
Update server versions post 24.09
pvijayakrish Sep 26, 2024
1fa799e
ci: Reducing flakiness of `L0_python_api` (#7674)
KrishnanPrash Oct 2, 2024
3a21f61
[doc]Adjusted formatting of the warning (#7675)
oandreeva-nv Oct 3, 2024
1df30ed
fix: usage of ReadDataFromJson in array tensors (#7624)
v-shobhit Oct 7, 2024
9bbee48
fix: `tritonfrontend` gRPC Streaming Segmentation Fault (#7671)
KrishnanPrash Oct 7, 2024
71a285a
test: Enhance Python gRPC streaming test to send multiple requests (#…
kthui Oct 7, 2024
d6488fd
refactor: Removing `Server` subclass from `tritonfrontend` (#7683)
KrishnanPrash Oct 8, 2024
fb430c7
feat: Add copyright hook (#7666)
pranavm-nvidia Oct 8, 2024
d13235c
build: Adding `tritonfrontend` to `build.py` (#7681)
KrishnanPrash Oct 9, 2024
466fed4
feat: OpenAI Compatible Frontend (#7561)
rmccorm4 Oct 11, 2024
f9ca1b8
docs: Add beta note to OpenAI compatible API (#7695)
rmccorm4 Oct 12, 2024
c730982
fix: Fix bug when targeting the TRT-LLM backend ensemble (#7700)
blongnv Oct 16, 2024
0200d2c
test: Allow ensemble to create the final response even if some of the…
kthui Oct 16, 2024
1a54d83
test: Update server repo for some tests (#7704)
jbkyang-nvi Oct 16, 2024
2961cf8
docs: Add example outputs to OpenAI Frontend docs (#7691)
KrishnanPrash Oct 16, 2024
01e77a8
chore: Fix genai-perf command and add missing copyrights (#7710)
rmccorm4 Oct 16, 2024
aeb20a1
docs: Clarify meanings of ensemble key and value (#7711)
kthui Oct 17, 2024
dedb9e7
fix: Re-enables copyright hook, updates GitHub Action to only run pre…
pranavm-nvidia Oct 18, 2024
940aa22
fix: Fix L0_perf_nomodel shared memory (#7709)
kthui Oct 18, 2024
6f6cbe0
Change compute capablity min value (#7708)
mc-nv Oct 18, 2024
aa93b95
build: `tritonfrontend` support for no/partial endpoint builds (#7605)
KrishnanPrash Oct 18, 2024
50bfc50
reaching _populate_restricted_features helper function
KrishnanPrash Oct 19, 2024
fc54539
working rough draft for C++ implementation
KrishnanPrash Oct 19, 2024
47cac90
removing hardcoding
KrishnanPrash Oct 19, 2024
ee198de
Revert "Change compute capablity min value (#7708)" (#7721)
mc-nv Oct 21, 2024
12b1968
test: Test and document histogram latency metrics (#7694)
yinggeh Oct 23, 2024
10d7eaa
fix: Copy models out of NFS before starting Triton to avoid intermitt…
rmccorm4 Oct 23, 2024
2f8de73
docs: Add support matrix for model parallelism in OpenAI Frontend (#7…
rmccorm4 Oct 23, 2024
dcfc6a0
test: Add L0_additional_dependency_dirs (#7707)
fpetrini15 Oct 23, 2024
128f19a
test: Add small delay to L0_lifecycle test_load_new_model_version aft…
kthui Oct 24, 2024
604b2aa
Removing caching on windows. (#7717)
mc-nv Oct 29, 2024
4453fa3
feat: Metrics Support in `tritonfrontend` (#7703)
KrishnanPrash Oct 31, 2024
97b366c
updating with changes from main
KrishnanPrash Oct 31, 2024
fad2723
Cleaning up includes
KrishnanPrash Oct 31, 2024
284e71d
build: RHEL8 Python Backend (#7744)
fpetrini15 Oct 31, 2024
7d84906
RestrictedFeature Protocols Enum
KrishnanPrash Oct 31, 2024
e2011d5
Incomplete restricted features class
KrishnanPrash Oct 31, 2024
3bfacf8
chore: ensure proper clean up in shared memory related tests (#7729)
GuanLuo Oct 31, 2024
c7589f1
refactor: Include job id and nightly tag to results uploaded (#7751)
kthui Oct 31, 2024
0b724f2
Update test script for TRT compatibility test to check for
pvijayakrish Oct 27, 2024
3b4fabd
Build: Update main branch post 24.10 release (#7754)
pvijayakrish Nov 1, 2024
67c59c8
ci: Adding tests for `numpy>=2` (#7756)
KrishnanPrash Nov 1, 2024
06b358a
Reapply "Change compute capability min value (#7708)" (#7757)
mc-nv Nov 1, 2024
d260877
Adding '.pyi' support to copyright hook
KrishnanPrash Nov 1, 2024
326b6db
User workflow #1
KrishnanPrash Nov 4, 2024
8941e15
build: Install tritonfrontend and tritonserver wheels by default in p…
KrishnanPrash Nov 4, 2024
a111b93
Working RestrictedFeatures Class
KrishnanPrash Nov 4, 2024
85c795e
Spacing
KrishnanPrash Nov 4, 2024
1f7a516
Fix model generation (#7764)
mc-nv Nov 4, 2024
0ee3952
Making handle_triton_error compatible with non-void funcs
KrishnanPrash Nov 4, 2024
1007652
Working Python Workflow
KrishnanPrash Nov 5, 2024
6191c67
test: Test per-model metric customization and document custom histogr…
yinggeh Nov 6, 2024
524e3a0
Fixed includes and added triton-common-json dependency
KrishnanPrash Nov 6, 2024
4725600
fix: Fixing pip installation as a system package (#7768)
KrishnanPrash Nov 6, 2024
5f8f07b
fix: Adding copyright support for `.pyi` files (#7769)
KrishnanPrash Nov 6, 2024
715599d
Working passing of restricted_features json to C++ and parsing the json
KrishnanPrash Nov 7, 2024
0269a3c
fix: Skip copyrights check for "expected" files in L0_model_config (#…
yinggeh Nov 7, 2024
8aeefc6
Working Json parsing on C++ side
KrishnanPrash Nov 7, 2024
c59b0ff
Working Solution that connects to the HTTP and gRPC frontends
KrishnanPrash Nov 7, 2024
fdebf86
Cleaning up code
KrishnanPrash Nov 7, 2024
47f2f1a
Working basic test suite
KrishnanPrash Nov 7, 2024
51b304f
Update 'main' to track development of 2.53.0 / 24.12 (#7771)
mc-nv Nov 7, 2024
f0824e6
Renaming and Testing
KrishnanPrash Nov 7, 2024
78d1fe3
Undoing unrelated changes
KrishnanPrash Nov 7, 2024
78e15e1
Cleaning up includes
KrishnanPrash Nov 7, 2024
34dfcfe
Spacing
KrishnanPrash Nov 7, 2024
2ec0ac8
Merge branch 'main' into kprashanth-tritonfrontend-rfeatures
KrishnanPrash Nov 8, 2024
789db4c
removing unused imports
KrishnanPrash Nov 8, 2024
bb9d2ce
documentation and clean up
KrishnanPrash Nov 8, 2024
90283e3
documentation and clean up
KrishnanPrash Nov 8, 2024
c236d39
Clean up
KrishnanPrash Nov 8, 2024
bfd0080
testing: removed untested/extra restricted features
KrishnanPrash Nov 8, 2024
d2ecac1
test: OpenAI frontend invalid chat tokenizer network issue WAR (#7779)
kthui Nov 8, 2024
1791d2d
Comments and Docs with examples
KrishnanPrash Nov 12, 2024
639d027
Changing restricted_apis/protocols to restricted_features
KrishnanPrash Nov 12, 2024
8c788c3
Removing unused import
KrishnanPrash Nov 12, 2024
0a8eb32
Documentation and Formatting
KrishnanPrash Nov 12, 2024
60f22e4
Update ONNX version for generated models (#7785)
mc-nv Nov 13, 2024
3c7a263
test: RHEL Filesystem Tests (#7788)
fpetrini15 Nov 14, 2024
66026e5
Update model generation scenario (#7793) (#7797)
mc-nv Nov 15, 2024
d4d9ebc
fix: Fix L0_input_validation (#7800)
pskiran1 Nov 19, 2024
3815390
build: Support RHEL ORT TensorRT Execution Provider (#7812)
fpetrini15 Nov 20, 2024
8c32e58
Update src/python/tritonfrontend/_api/_error_mapping.py
KrishnanPrash Nov 20, 2024
2eb481d
ci: modifying stat count for `L0_server_status` (#7820)
KrishnanPrash Nov 21, 2024
fb89be7
build: update build.py to pass versions as input parameter and conver…
nvda-mesharma Nov 21, 2024
16154f2
fix: Resolve integer overflow in Load API file decoding (#7787)
pskiran1 Nov 22, 2024
eb1d290
feat: Enable deferred unregistering of shared memory regions after in…
pskiran1 Nov 25, 2024
9e181b9
ci: Fix L0_cuda_shared_memory (#7832)
pskiran1 Nov 26, 2024
3ac229e
Update `main` branch post 24.11 (#7829)
mc-nv Nov 26, 2024
82bcdc4
build: Update OpenVINO model generation script with new API (#7811)
yinggeh Nov 28, 2024
5e3fb3c
Revising docs, Support for removing Features, Adding Testing
KrishnanPrash Nov 30, 2024
46647dd
Merge branch 'main' into kprashanth-tritonfrontend-rfeatures
KrishnanPrash Nov 30, 2024
4394ef9
Clean up
KrishnanPrash Nov 30, 2024
bc25051
Adding decarator support for field validator and fixing variable names
KrishnanPrash Nov 30, 2024
08c37c4
Raising tritonfrontend error instead of tritonserver error
KrishnanPrash Dec 2, 2024
8ca82b1
Removing unused import
KrishnanPrash Dec 2, 2024
b454830
Update qa/L0_python_api/test_kserve.py
KrishnanPrash Dec 2, 2024
cffd318
fix: L0_sequence_batcher_cudashm (#7852)
oandreeva-nv Dec 4, 2024
788802c
fix: gRPC segfault due to Low Request Cancellation Timeout (#7840)
yinggeh Dec 9, 2024
0d6b9b4
ci: RHEL8 L0_backend_python Support (#7859)
fpetrini15 Dec 10, 2024
c87259a
fix: Lock httpx version to fix L0_openai--trtllm test failures (#7870)
rmccorm4 Dec 11, 2024
440c827
fix: Remove .Server subclass to reflect 24.12 tritonfrontend version …
rmccorm4 Dec 11, 2024
11af829
test: Fix requested output deleting extra outputs (#7866)
kthui Dec 11, 2024
fc0fe6b
Update generated Dockerfile (#7876)
mc-nv Dec 12, 2024
e8a6090
build: Adding b64 dependency to relevant targets (fix L0_build_varian…
KrishnanPrash Dec 13, 2024
8af3a38
Merge branch 'main' into kprashanth-tritonfrontend-rfeatures
KrishnanPrash Dec 13, 2024
44cbfad
Update qa/L0_python_api/test_kserve.py
KrishnanPrash Dec 13, 2024
8b3aa4e
removing redundant testing
KrishnanPrash Dec 13, 2024
141a440
Testing invalid value and no headers
KrishnanPrash Dec 13, 2024
fedcfac
fix: Handle dict type for content field in Chat Completions endpoint …
dongs0104 Dec 16, 2024
587f877
ci: Fix Windows CI Errors (#7837)
fpetrini15 Dec 17, 2024
9758344
docs: Re-structure User Guides for Discoverability (#7807)
statiraju Dec 18, 2024
e7642ee
error_mapping comment
KrishnanPrash Dec 19, 2024
4509ceb
comment formatting
KrishnanPrash Dec 19, 2024
16b4d1b
update and remove functionality added to RF
KrishnanPrash Dec 20, 2024
b52b29c
Adding testing
KrishnanPrash Dec 20, 2024
157e76a
Skipping grpc tests
KrishnanPrash Dec 20, 2024
9555a48
Merge branch 'main' into kprashanth-tritonfrontend-rfeatures
KrishnanPrash Dec 20, 2024
ae5d55a
Updating restricted features docs and adding comments
KrishnanPrash Dec 20, 2024
bf51bc1
removed unused import and added comment
KrishnanPrash Dec 20, 2024
47ab2af
Fix no endpoint/no file-system build
KrishnanPrash Dec 20, 2024
3ed8edb
Fix for Metrics/RestrictedFeatures path
KrishnanPrash Dec 20, 2024
ab34c9d
correcting ticket number
KrishnanPrash Dec 20, 2024
793d826
Update src/python/examples/example_model_repository/identity/config.p…
KrishnanPrash Dec 20, 2024
6 changes: 4 additions & 2 deletions .clang-format
@@ -2,7 +2,8 @@
BasedOnStyle: Google

IndentWidth: 2
ContinuationIndentWidth: 2
ColumnLimit: 80
ContinuationIndentWidth: 4
UseTab: Never
MaxEmptyLinesToKeep: 2

@@ -34,4 +35,5 @@ BinPackArguments: true
BinPackParameters: true
ConstructorInitializerAllOnOneLineOrOnePerLine: false

IndentCaseLabels: true
IndentCaseLabels: true

24 changes: 24 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -0,0 +1,24 @@
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''

---

**Description**
A clear and concise description of what the bug is.

**Triton Information**
What version of Triton are you using?

Are you using the Triton container or did you build it yourself?

**To Reproduce**
Steps to reproduce the behavior.

Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).

**Expected behavior**
A clear and concise description of what you expected to happen.
20 changes: 20 additions & 0 deletions .github/ISSUE_TEMPLATE/feature_request.md
@@ -0,0 +1,20 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''

---

**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

**Describe the solution you'd like**
A clear and concise description of what you want to happen.

**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.

**Additional context**
Add any other context or screenshots about the feature request here.
@@ -0,0 +1,50 @@
#### What does the PR do?
<!-- Describe your pull request here. Please read the text below the line, and make sure you follow the checklist.-->

#### Checklist
- [ ] I have read the [Contribution guidelines](#../../CONTRIBUTING.md) and signed the [Contributor License
Agreement](https://github.com/NVIDIA/triton-inference-server/blob/master/Triton-CCLA-v1.pdf)
- [ ] PR title reflects the change and is of format `<commit_type>: <Title>`
- [ ] Changes are described in the pull request.
- [ ] Related issues are referenced.
- [ ] Populated [github labels](https://docs.github.com/en/issues/using-labels-and-milestones-to-track-work/managing-labels) field
- [ ] Added [test plan](#test-plan) and verified test passes.
- [ ] Verified that the PR passes existing CI.
- [ ] I ran pre-commit locally (`pre-commit install, pre-commit run --all`)
- [ ] Verified copyright is correct on all changed files.
- [ ] Added _succinct_ git squash message before merging [ref](https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html).
- [ ] All template sections are filled out.
- [ ] Optional: Additional screenshots for behavior/output changes with before/after.

#### Commit Type:
Check the [conventional commit type](https://github.com/angular/angular/blob/22b96b9/CONTRIBUTING.md#type)
box here and add the label to the github PR.
- [ ] build
- [ ] ci
- [ ] docs
- [ ] feat
- [ ] fix
- [ ] perf
- [ ] refactor
- [ ] revert
- [ ] style
- [ ] test

#### Related PRs:
<!-- Related PRs from other Repositories -->

#### Where should the reviewer start?
<!-- call out specific files that should be looked at closely -->

#### Test plan:
<!-- list steps to verify feature works -->
<!-- were e2e tests added?-->

#### Caveats:
<!-- any limitations or possible things missing from this PR -->

#### Background
<!-- e.g. what led to this change being made. this is optional extra information to help the reviewer -->

#### Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
- closes GitHub issue: #xxx
@@ -0,0 +1,50 @@
#### What does the PR do?
<!-- Describe your pull request here. Please read the text below the line, and make sure you follow the checklist.-->

#### Checklist
- [ ] PR title reflects the change and is of format `<commit_type>: <Title>`
- [ ] Changes are described in the pull request.
- [ ] Related issues are referenced.
- [ ] Populated [github labels](https://docs.github.com/en/issues/using-labels-and-milestones-to-track-work/managing-labels) field
- [ ] Added [test plan](#test-plan) and verified test passes.
- [ ] Verified that the PR passes existing CI.
- [ ] Verified copyright is correct on all changed files.
- [ ] Added _succinct_ git squash message before merging [ref](https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html).
- [ ] All template sections are filled out.
- [ ] Optional: Additional screenshots for behavior/output changes with before/after.

#### Commit Type:
Check the [conventional commit type](https://github.com/angular/angular/blob/22b96b9/CONTRIBUTING.md#type)
box here and add the label to the github PR.
- [ ] build
- [ ] ci
- [ ] docs
- [ ] feat
- [ ] fix
- [ ] perf
- [ ] refactor
- [ ] revert
- [ ] style
- [ ] test

#### Related PRs:
<!-- Related PRs from other Repositories -->

#### Where should the reviewer start?
<!-- call out specific files that should be looked at closely -->

#### Test plan:
<!-- list steps to verify -->
<!-- were e2e tests added?-->

- CI Pipeline ID:
<!-- Only Pipeline ID and no direct link here -->

#### Caveats:
<!-- any limitations or possible things missing from this PR -->

#### Background
<!-- e.g. what led to this change being made. this is optional extra information to help the reviewer -->

#### Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
- closes GitHub issue: #xxx
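
The title rule in the checklists above (`<commit_type>: <Title>`, with the commit types listed under "Commit Type") can be checked mechanically. The following is a minimal sketch, not part of this PR: the function name `check_pr_title` and the exact regular expression are assumptions made for illustration, and the repository may enforce the rule differently or not at all.

```python
import re

# Commit types copied from the "Commit Type" checklist in the PR templates.
COMMIT_TYPES = (
    "build", "ci", "docs", "feat", "fix",
    "perf", "refactor", "revert", "style", "test",
)

# Hypothetical pattern for "<commit_type>: <Title>" with a non-empty title.
_TITLE_RE = re.compile(r"^(?P<type>[a-z]+): (?P<title>\S.*)$")


def check_pr_title(title: str) -> bool:
    """Return True if the title has the form '<commit_type>: <Title>'."""
    match = _TITLE_RE.match(title)
    return bool(match) and match.group("type") in COMMIT_TYPES
```

Under this strict reading, a title such as `feat: Adding RestrictedFeatures Support to the Python Frontend Bindings` passes, while titles with no prefix, a bracketed prefix like `[build]:`, or a type outside the list such as `chore:` are rejected.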
13 changes: 13 additions & 0 deletions .github/pull_request_template.md
@@ -0,0 +1,13 @@
Thanks for submitting a PR to Triton!
Please go to the `Preview` tab above this description box and select the appropriate sub-template:

* [PR description template for Triton Engineers](?expand=1&template=pull_request_template_internal_contrib.md)
* [PR description template for External Contributors](?expand=1&template=pull_request_template_external_contrib.md)

If you already created the PR, please replace this message with one of
* [External contribution template](https://raw.githubusercontent.com/triton-inference-server/server/main/.github/PULL_REQUEST_TEMPLATE/pull_request_template_external_contrib.md)
* [Internal contribution template](https://raw.githubusercontent.com/triton-inference-server/server/main/.github/PULL_REQUEST_TEMPLATE/pull_request_template_internal_contrib.md)

and fill it out.


84 changes: 84 additions & 0 deletions .github/workflows/codeql.yml
@@ -0,0 +1,84 @@
# Copyright 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# * Neither the name of NVIDIA CORPORATION nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

name: "CodeQL"

on:
  pull_request:

jobs:
  analyze:
    name: Analyze
    runs-on: ubuntu-latest
    permissions:
      actions: read
      contents: read
      security-events: write

    strategy:
      fail-fast: false
      matrix:
        language: [ 'python' ]
        # CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python', 'ruby' ]
        # Learn more about CodeQL language support at https://aka.ms/codeql-docs/language-support

    steps:
    - name: Checkout repository
      uses: actions/checkout@v3

    # Initializes the CodeQL tools for scanning.
    - name: Initialize CodeQL
      uses: github/codeql-action/init@v2
      with:
        languages: ${{ matrix.language }}
        # If you wish to specify custom queries, you can do so here or in a config file.
        # By default, queries listed here will override any specified in a config file.
        # Prefix the list here with "+" to use these queries and those in the config file.

        # Details on CodeQL's query packs refer to:
        # https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs
        queries: +security-and-quality


    # Autobuild attempts to build any compiled languages (C/C++, C#, Go, or Java).
    # If this step fails, then you should remove it and run the build manually (see below)
    - name: Autobuild
      uses: github/codeql-action/autobuild@v2

    # Command-line programs to run using the OS shell.
    # See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun

    # If the Autobuild fails above, remove it and uncomment the following three lines.
    # modify them (or add more) to build your code if your project, please refer to the EXAMPLE below for guidance.

    # - run: |
    # echo "Run, Build Application using script"
    # ./location_of_script_within_repo/buildscript.sh

    - name: Perform CodeQL Analysis
      uses: github/codeql-action/analyze@v2
      with:
        category: "/language:${{matrix.language}}"
45 changes: 45 additions & 0 deletions .github/workflows/pre-commit.yaml
@@ -0,0 +1,45 @@
# Copyright 2023-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# * Neither the name of NVIDIA CORPORATION nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

name: pre-commit

on:
  pull_request:

jobs:
  pre-commit:
    runs-on: ubuntu-22.04
    steps:
    - uses: actions/checkout@v3
      with:
        fetch-depth: 2
    - name: Get modified files
      id: modified-files
      run: echo "modified_files=$(git diff --name-only -r HEAD^1 HEAD | xargs)" >> $GITHUB_OUTPUT
    - uses: actions/setup-python@v3
    - uses: pre-commit/[email protected]
      with:
        extra_args: --files ${{ steps.modified-files.outputs.modified_files }}
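
For readers who want to reproduce the "Get modified files" step locally, the sketch below mirrors it in Python: it lists the paths changed between `HEAD^1` and `HEAD` and builds the `--files` arguments that the workflow passes on. The helper names (`parse_name_only`, `files_flag`, `modified_files`) are hypothetical and do not appear in this PR. It assumes `git` is on the `PATH` and that the checkout contains at least two commits, as `fetch-depth: 2` ensures in the workflow.

```python
import subprocess


def parse_name_only(diff_output: str) -> list[str]:
    """Split `git diff --name-only` output into paths, dropping blank lines."""
    return [line.strip() for line in diff_output.splitlines() if line.strip()]


def files_flag(files: list[str]) -> list[str]:
    """Build the `--files <path> ...` arguments, mirroring `extra_args`."""
    return ["--files", *files]


def modified_files(base: str = "HEAD^1", head: str = "HEAD") -> list[str]:
    """Return the paths changed between `base` and `head` in the current repo."""
    result = subprocess.run(
        ["git", "diff", "--name-only", "-r", base, head],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_name_only(result.stdout)
```

One difference from the shell version: `xargs` joins the names with spaces, so a path containing whitespace would be split into several arguments, whereas this sketch keeps each path as a single list element.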
28 changes: 17 additions & 11 deletions .gitignore
@@ -1,11 +1,17 @@
/bazel-bin
/bazel-ci_build-cache
/bazel-genfiles
/bazel-trtserver
/bazel-out
/bazel-serving
/bazel-tensorflow
/bazel-tensorflow_serving
/bazel-testlogs
/bazel-tf
/bazel-workspace
/build
/builddir
/.vscode
*.so
__pycache__
tmp
*.log
*.xml
test_results.txt
artifacts
cprofile
*.prof

# Test exclusions
qa/L0_openai/openai
tensorrtllm_models
custom_tokenizer