Embedders
Bases: Protocol
Embeds chunks and queries. embed(chunks) must preserve input order.
Source code in src/cenote/embedders/base.py
model_id: str (property)
Identifier of the form 'provider:model_name', e.g. 'voyage:voyage-3'.
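A minimal sketch of what such a protocol could look like, for orientation only. The `Vector` alias, the `str` chunk type, and the async `embed` signature are assumptions, not the definitions in base.py; `ConstantEmbedder` is a hypothetical class that exists only to show structural typing.

```python
import asyncio
from typing import Protocol, Sequence, runtime_checkable

Vector = list[float]  # assumed alias; the real type is defined in cenote


@runtime_checkable
class Embedder(Protocol):
    """Assumed shape of the protocol: an identifier plus an embed call."""

    @property
    def model_id(self) -> str: ...

    async def embed(self, chunks: Sequence[str]) -> list[Vector]: ...


class ConstantEmbedder:
    """Hypothetical implementation; it satisfies the protocol without inheriting."""

    @property
    def model_id(self) -> str:
        return "demo:constant"  # follows the 'provider:model_name' convention

    async def embed(self, chunks: Sequence[str]) -> list[Vector]:
        # One vector per chunk, returned in the same order as the input.
        return [[float(len(chunk))] for chunk in chunks]
```

Because the protocol is structural, any object exposing `model_id` and `embed` can be passed where an embedder is expected.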
Deterministic unit-norm embedder for tests and demos (no network).
Source code in src/cenote/embedders/mock.py
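One way to get a deterministic, unit-norm vector without a network call is to derive it from a hash of the text. The sketch below illustrates that idea; it is not the code in mock.py, and `mock_embed` and its `dim` parameter are hypothetical names.

```python
import hashlib
import math


def mock_embed(text: str, dim: int = 8) -> list[float]:
    """Return a hash-derived pseudo-embedding scaled to unit L2 norm."""
    raw: list[float] = []
    counter = 0
    while len(raw) < dim:
        # Re-hash with a counter prefix until enough bytes are available.
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        # Map each byte (0..255) into [-1, 1]; no byte maps to exactly 0,
        # so the norm below is never zero.
        raw.extend(b / 127.5 - 1.0 for b in digest)
        counter += 1
    raw = raw[:dim]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw]
```

The same text always yields the same vector, which keeps test assertions stable across runs and machines.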
Voyage AI embedder with batching, concurrency, and optional rate limiting.
Splits inputs into requests of at most batch_size items, issued concurrently
with at most max_concurrency in flight. Pass requests_per_minute for free-tier accounts (300 RPM).
Source code in src/cenote/embedders/voyage.py
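The batching contract described above can be sketched with a semaphore and simple start-time pacing. This is an illustration under assumptions, not the code in voyage.py: `embed_batched`, `send_batch`, and the default values are hypothetical, and `send_batch` stands in for the actual HTTP call.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional, Sequence

Vector = list[float]
BatchFn = Callable[[Sequence[str]], Awaitable[list[Vector]]]


async def embed_batched(
    texts: Sequence[str],
    send_batch: BatchFn,  # hypothetical: performs one API request
    batch_size: int = 128,  # assumed default
    max_concurrency: int = 4,  # assumed default
    requests_per_minute: Optional[int] = None,
) -> list[Vector]:
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]
    semaphore = asyncio.Semaphore(max_concurrency)
    interval = 60.0 / requests_per_minute if requests_per_minute else 0.0
    start = time.monotonic()

    async def run(index: int, batch: Sequence[str]) -> list[Vector]:
        # Pacing: request k may not start before start + k * interval.
        delay = start + index * interval - time.monotonic()
        if delay > 0:
            await asyncio.sleep(delay)
        async with semaphore:  # caps the number of in-flight requests
            return await send_batch(batch)

    # gather returns results in argument order, so input order is preserved
    # even when batches complete out of order.
    results = await asyncio.gather(*(run(i, b) for i, b in enumerate(batches)))
    return [vector for batch in results for vector in batch]
```

With `requests_per_minute=300`, the pacing interval is 0.2 s between request starts.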
Cohere embedder with batching, concurrency, and optional rate limiting.
Multilingual via embed-multilingual-v3.0. Same batching contract as
VoyageEmbedder; vectors are read from the embeddings.float field of the Cohere v2 response.
Source code in src/cenote/embedders/cohere.py
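To make the response-shape note concrete, the sketch below pulls vectors out of a payload whose embeddings are nested under a `float` key. `extract_float_embeddings` is a hypothetical helper, and the payload in the usage note is hand-written for illustration rather than captured from the API.

```python
from typing import Any


def extract_float_embeddings(payload: dict[str, Any]) -> list[list[float]]:
    """Read vectors from the nested 'embeddings.float' field of a response dict."""
    embeddings = payload.get("embeddings")
    if not isinstance(embeddings, dict) or "float" not in embeddings:
        raise ValueError("expected an 'embeddings.float' field in the response")
    # Coerce to float so integer-valued JSON numbers do not leak through.
    return [[float(x) for x in vector] for vector in embeddings["float"]]
```

For example, `extract_float_embeddings({"embeddings": {"float": [[0.1, 0.2], [0.3, 0.4]]}})` returns the two vectors in order, while a payload with a flat `embeddings` list raises `ValueError`.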
Bases: Protocol
Async key-value store for embedding vectors, keyed by (model_id, content_hash).
Source code in src/cenote/embedders/cache.py
set_many(items: list[tuple[str, str, Vector]]) -> None (async)
Bulk write. Persistent backends perform it in a single transaction.
items is a list of (model_id, content_hash, embedding) tuples.
Source code in src/cenote/embedders/cache.py
Dict-backed EmbeddingCache. Suitable for tests and small workloads.
Source code in src/cenote/embedders/cache.py
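A dict-backed cache keyed by (model_id, content_hash) might look roughly like the sketch below. Only `set_many` and its signature come from the reference above; the class name `DictEmbeddingCache`, the `get` method, and the `Vector` alias are assumptions made for illustration.

```python
import asyncio
from typing import Optional

Vector = list[float]  # assumed alias


class DictEmbeddingCache:
    """Illustrative in-memory cache; entries live only as long as the process."""

    def __init__(self) -> None:
        self._store: dict[tuple[str, str], Vector] = {}

    async def get(self, model_id: str, content_hash: str) -> Optional[Vector]:
        # Hypothetical read method; returns None on a miss.
        return self._store.get((model_id, content_hash))

    async def set_many(self, items: list[tuple[str, str, Vector]]) -> None:
        # Each item is (model_id, content_hash, embedding).
        for model_id, content_hash, embedding in items:
            self._store[(model_id, content_hash)] = embedding
```

Including model_id in the key keeps vectors from different models apart even when the content hash is identical.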
Wraps an Embedder with an EmbeddingCache.
On embed(), checks cache per chunk by (model_id, content_hash); only misses are forwarded to the inner embedder. Output order matches input order.
Source code in src/cenote/embedders/cache.py
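The lookup-then-fill flow described above can be sketched as follows. This is not the code in cache.py: a plain dict stands in for EmbeddingCache, SHA-256 of the chunk text is an assumed content hash, and `CountingEmbedder` and `CachingEmbedder` are hypothetical names.

```python
import asyncio
import hashlib
from typing import Optional, Sequence

Vector = list[float]
CacheKey = tuple[str, str]  # (model_id, content_hash)


class CountingEmbedder:
    """Hypothetical inner embedder that records what it was asked to embed."""

    model_id = "demo:counting"

    def __init__(self) -> None:
        self.seen: list[str] = []

    async def embed(self, chunks: Sequence[str]) -> list[Vector]:
        self.seen.extend(chunks)
        return [[float(len(chunk))] for chunk in chunks]


class CachingEmbedder:
    """Sketch of a cache wrapper; a plain dict stands in for EmbeddingCache."""

    def __init__(self, inner: CountingEmbedder, cache: dict[CacheKey, Vector]) -> None:
        self.inner = inner
        self.cache = cache

    @property
    def model_id(self) -> str:
        return self.inner.model_id

    async def embed(self, chunks: Sequence[str]) -> list[Vector]:
        keys = [
            (self.model_id, hashlib.sha256(chunk.encode("utf-8")).hexdigest())
            for chunk in chunks
        ]
        results: list[Optional[Vector]] = [self.cache.get(key) for key in keys]
        misses = [i for i, vector in enumerate(results) if vector is None]
        if misses:
            # Forward only the misses, then write them back at their original positions.
            fresh = await self.inner.embed([chunks[i] for i in misses])
            for i, vector in zip(misses, fresh):
                results[i] = vector
                self.cache[keys[i]] = vector
        return [vector for vector in results if vector is not None]
```

Tracking miss indices rather than appending results is what keeps the output aligned with the input when hits and misses are interleaved.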