Add cohere embedding for ai-cache #1572

Merged
5 commits merged into alibaba:main on Dec 27, 2024

ayanami-desu
Contributor

Ⅰ. Describe what this PR did

Ⅱ. Does this pull request fix one issue?

fixes #1449

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

docker-compose.yml

services:
  envoy:
    image: higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:v2.0.2
    entrypoint: /usr/local/bin/envoy
    command: -c /etc/envoy/envoy.yaml --component-log-level wasm:debug
    networks:
    - wasmtest
    ports:
    - "10000:10000"
    volumes:
    - ./envoy.yaml:/etc/envoy/envoy.yaml
    - ./main.wasm:/etc/envoy/main.wasm
    - ./ai.wasm:/etc/envoy/ai.wasm

networks:
  wasmtest: {}

envoy.yaml

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: deepseek
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/ai.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                "provider": {
                                  "type": "deepseek",
                                  "apiTokens": [
                                    "sk-"
                                  ]
                                }
                              }

                  - name: cache
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: cache
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/main.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                "embedding": {
                                  "type": "cohere",
                                  "serviceName": "cohere.dns",
                                  "apiKey": ""
                                },
                                "vector": {
                                  "type": "dashvector",
                                  "serviceName": "dashvector.dns",
                                  "collectionID": "first",
                                  "serviceHost": "vrs-cn-.dashvector.cn-hangzhou.aliyuncs.com",
                                  "apiKey": "sk-"
                                },
                                "cache": {
                                  "serviceName": "",
                                  "type": ""
                                }
                              }
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
   - name: deepseek
     connect_timeout: 30s
     type: LOGICAL_DNS
     dns_lookup_family: V4_ONLY
     lb_policy: ROUND_ROBIN
     load_assignment:
       cluster_name: deepseek
       endpoints:
         - lb_endpoints:
             - endpoint:
                 address:
                   socket_address:
                     address: api.deepseek.com
                     port_value: 443
     transport_socket:
       name: envoy.transport_sockets.tls
       typed_config:
         "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
         "sni": "api.deepseek.com"

   - name: outbound|443||cohere.dns
     connect_timeout: 30s
     type: LOGICAL_DNS
     dns_lookup_family: V4_ONLY
     lb_policy: ROUND_ROBIN
     load_assignment:
       cluster_name: outbound|443||cohere.dns
       endpoints:
         - lb_endpoints:
             - endpoint:
                 address:
                   socket_address:
                     address: api.cohere.com
                     port_value: 443
     transport_socket:
       name: envoy.transport_sockets.tls
       typed_config:
         "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
         "sni": "api.cohere.com"

   - name: outbound|443||dashvector.dns
     connect_timeout: 30s
     type: LOGICAL_DNS
     dns_lookup_family: V4_ONLY
     lb_policy: ROUND_ROBIN
     load_assignment:
       cluster_name: outbound|443||dashvector.dns
       endpoints:
         - lb_endpoints:
             - endpoint:
                 address:
                   socket_address:
                     address: vrs-cn-.dashvector.cn-hangzhou.aliyuncs.com
                     port_value: 443
     transport_socket:
       name: envoy.transport_sockets.tls
       typed_config:
         "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
         "sni": "vrs-cn-.dashvector.cn-hangzhou.aliyuncs.com"

[Three screenshots; images not preserved]
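
One way to exercise the setup above is to send the same chat request twice through the listener on port 10000 and compare the two round trips. The following is a minimal Go sketch, not the author's verification procedure. The request path `/v1/chat/completions`, the model name `deepseek-chat`, and the helper names `buildBody` and `ask` are assumptions made for illustration; none of them appear in this PR.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// message and chatRequest model an OpenAI-style chat payload (assumed format).
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string    `json:"model"`
	Messages []message `json:"messages"`
}

// buildBody serializes a single-turn question. The model name is an assumption.
func buildBody(question string) ([]byte, error) {
	return json.Marshal(chatRequest{
		Model:    "deepseek-chat",
		Messages: []message{{Role: "user", Content: question}},
	})
}

// ask posts the question and returns the raw response body and the elapsed time.
func ask(client *http.Client, url, question string) (string, time.Duration, error) {
	body, err := buildBody(question)
	if err != nil {
		return "", 0, err
	}
	start := time.Now()
	resp, err := client.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return "", 0, err
	}
	defer resp.Body.Close()
	data, err := io.ReadAll(resp.Body)
	return string(data), time.Since(start), err
}

func main() {
	client := &http.Client{Timeout: 5 * time.Minute}
	// Port 10000 comes from docker-compose.yml; the path is an assumption.
	url := "http://localhost:10000/v1/chat/completions"
	for i := 1; i <= 2; i++ {
		out, took, err := ask(client, url, "What is Higress?")
		if err != nil {
			fmt.Println("request failed:", err)
			return
		}
		fmt.Printf("attempt %d took %v: %s\n", i, took, out)
	}
}
```

If the cache filter is working, the second attempt would be expected to return noticeably faster than the first. The `wasm:debug` log level set in docker-compose.yml should also show the plugin's activity in the Envoy container output.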

Ⅴ. Special notes for reviews

@CLAassistant

CLAassistant commented Dec 6, 2024

CLA assistant check
All committers have signed the CLA.

@ayanami-desu
Contributor Author

The previous PR got a few things tangled up, so I have reopened it as this one.

@johnlanni
Collaborator

@ayanami-desu Please sign the CLA. Some files also have conflicts; please fix those as well.

@ayanami-desu
Contributor Author

The conflicts are resolved.

@@ -82,11 +76,11 @@ func (c *ProviderConfig) Validate() error {
 	if c.typ == "" {
 		return errors.New("embedding service type is required")
 	}
-	initializer, has := providerInitializers[c.typ]
+	_, has := providerInitializers[c.typ]
 	if !has {
Collaborator

Shouldn't this check be changed? Checking c.initializer directly should be enough, right?
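
One possible reading of this suggestion is sketched below: the initializer is resolved once when the config is built and stored on the struct, so `Validate` only has to test the field instead of repeating the map lookup. This is an illustration under assumptions, not the merged code. The `providerInitializer` interface, the `cohereInitializer` type, the `apiKey` field, and `NewProviderConfig` are hypothetical; only `ProviderConfig`, `typ`, `providerInitializers`, `c.initializer`, and the first error message come from the thread.

```go
package main

import (
	"errors"
	"fmt"
)

// providerInitializer is a hypothetical interface for per-provider validation.
type providerInitializer interface {
	ValidateConfig(c ProviderConfig) error
}

// ProviderConfig mirrors the names used in the diff; apiKey is assumed.
type ProviderConfig struct {
	typ         string
	apiKey      string
	initializer providerInitializer
}

// cohereInitializer is a hypothetical stand-in for the Cohere provider.
type cohereInitializer struct{}

func (cohereInitializer) ValidateConfig(c ProviderConfig) error {
	if c.apiKey == "" {
		return errors.New("cohere: apiKey is required")
	}
	return nil
}

// providerInitializers is the registry keyed by provider type.
var providerInitializers = map[string]providerInitializer{
	"cohere": cohereInitializer{},
}

// NewProviderConfig (hypothetical) resolves the initializer once.
// A missing key yields a nil interface value.
func NewProviderConfig(typ, apiKey string) *ProviderConfig {
	return &ProviderConfig{typ: typ, apiKey: apiKey, initializer: providerInitializers[typ]}
}

// Validate checks the stored initializer instead of repeating the map lookup.
func (c *ProviderConfig) Validate() error {
	if c.typ == "" {
		return errors.New("embedding service type is required")
	}
	if c.initializer == nil {
		return errors.New("unknown embedding service type: " + c.typ)
	}
	return c.initializer.ValidateConfig(*c)
}

func main() {
	for _, typ := range []string{"cohere", "unknown", ""} {
		err := NewProviderConfig(typ, "example-key").Validate()
		fmt.Printf("type=%q err=%v\n", typ, err)
	}
}
```

The trade-off is that the stored field has to be populated before `Validate` runs. The map lookup in the diff has no such ordering requirement.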

Collaborator

@CH3CHO CH3CHO left a comment

LGTM

@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 43.50%. Comparing base (ef31e09) to head (70ea2c3).
Report is 245 commits behind head on main.

@@            Coverage Diff             @@
##             main    #1572      +/-   ##
==========================================
+ Coverage   35.91%   43.50%   +7.59%     
==========================================
  Files          69       76       +7     
  Lines       11576    12325     +749     
==========================================
+ Hits         4157     5362    +1205     
+ Misses       7104     6627     -477     
- Partials      315      336      +21     

see 69 files with indirect coverage changes

@CH3CHO CH3CHO merged commit 2d74c48 into alibaba:main Dec 27, 2024
13 checks passed

Successfully merging this pull request may close these issues.

AI cache plugin: integrate with Cohere https://docs.cohere.com/reference/embed
5 participants