Commit fc9d1b5

Merge pull request #3348 from milvus-io/update-docs
update docs
2 parents 3d72edb + 35c2f18 commit fc9d1b5

1 file changed: +151 -35 lines changed

site/en/userGuide/schema/array-of-structs.md

Lines changed: 151 additions & 35 deletions
@@ -2,7 +2,6 @@
22
id: array-of-structs.md
33
title: "Array of Structs"
44
summary: "An Array of Structs field in an entity stores an ordered set of Struct elements. Each Struct in the Array shares the same pre-defined schema, comprising multiple vectors and scalar fields."
5-
beta: Milvus 2.6.4+
65
---
76

87
# Array of Structs
@@ -15,14 +14,14 @@ Here's an example of an entity from a collection that contains an Array of Struc
1514
{
1615
'id': 0,
1716
'title': 'Walden',
18-
'title_vector': [0.1, 0.2, 0.3, 0.4, 0.5]
17+
'title_vector': [0.1, 0.2, 0.3, 0.4, 0.5],
1918
'author': 'Henry David Thoreau',
2019
'year_of_publication': 1845,
2120
// highlight-start
2221
'chunks': [
2322
{
2423
'text': 'When I wrote the following pages, or rather the bulk of them...',
25-
'text_vector': [0.3, 0.2, 0.3, 0.2, 0.5]
24+
'text_vector': [0.3, 0.2, 0.3, 0.2, 0.5],
2625
'chapter': 'Economy',
2726
},
2827
{
@@ -88,7 +87,7 @@ In the example above, the `chunks` field is an Array of Structs field, and each
8887

8988
All vector fields in a collection must be indexed. To index a vector field within an Array of Structs field, Milvus uses an embedding list to organize the vector embeddings in each Struct element and indexes the entire embedding list as a whole.
9089

91-
You can use `HNSW` as the index type and any metric type listed below to build indexes for the embedding lists in an Array of Structs field.
90+
You can use `AUTOINDEX` or `HNSW` as the index type and any metric type listed below to build indexes for the embedding lists in an Array of Structs field.
9291

9392
<table>
9493
<tr>
@@ -97,23 +96,16 @@ In the example above, the `chunks` field is an Array of Structs field, and each
9796
<th><p>Remarks</p></th>
9897
</tr>
9998
<tr>
100-
<td rowspan="5"><p><code>HNSW</code></p></td>
99+
<td rowspan="3"><p><code>AUTOINDEX</code> (or <code>HNSW</code>)</p></td>
101100
<td><p><code>MAX_SIM_COSINE</code></p></td>
102-
<td rowspan="3"><p>For embedding lists of the following types:</p><ul><li><p>FLOAT_VECTOR</p></li><li><p>FLOAT16_VECTOR</p></li><li><p>BFLOAT16_VECTOR</p></li><li><p>INT8_VECTOR</p></li></ul></td>
101+
<td rowspan="3"><p>For embedding lists of the following types:</p><ul><li>FLOAT_VECTOR</li></ul></td>
103102
</tr>
104103
<tr>
105104
<td><p><code>MAX_SIM_IP</code></p></td>
106105
</tr>
107106
<tr>
108107
<td><p><code>MAX_SIM_L2</code></p></td>
109108
</tr>
110-
<tr>
111-
<td><p><code>MAX_SIM_HAMMING</code></p></td>
112-
<td rowspan="2"><p>For embedding lists of the BINARY_VECTOR type</p></td>
113-
</tr>
114-
<tr>
115-
<td><p><code>MAX_SIM_JACCARD</code></p></td>
116-
</tr>
117109
</table>
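
To make the `MAX_SIM_*` metric types more concrete, here is a minimal pure-Python sketch of how a similarity between two embedding lists could be computed. It assumes the late-interaction (ColBERT-style) MaxSim definition: for each query vector, take its best cosine match in the stored embedding list, then sum those best scores. The function names `cosine` and `max_sim_cosine` are hypothetical, and the actual Milvus implementation (including any normalization) may differ.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Treat a zero vector as having no similarity to anything.
        return 0.0
    return dot / (norm_a * norm_b)


def max_sim_cosine(query_list: Sequence[Vector], doc_list: Sequence[Vector]) -> float:
    """Assumed MaxSim: sum over query vectors of the best cosine match in doc_list."""
    if not query_list or not doc_list:
        raise ValueError("both embedding lists must contain at least one vector")
    return sum(max(cosine(q, d) for d in doc_list) for q in query_list)


# Query embedding list with two vectors, compared against the two
# `text_vector` values of the sample `chunks` field.
query = [[-0.2, -0.2, 0.5, 0.6, 0.9], [-0.4, 0.3, 0.5, 0.8, 0.2]]
chunks = [[0.3, 0.2, 0.3, 0.2, 0.5], [0.7, 0.4, 0.2, 0.7, 0.8]]
score = max_sim_cosine(query, chunks)
```

Under this assumed definition, the score grows with the number of query vectors, so scores are only comparable between searches that use query embedding lists of the same length.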
118110

119111
The scalar fields in the Array of Structs field do not support indexes.
@@ -193,7 +185,63 @@ schema.add_field("chunks", datatype=DataType.ARRAY, element_type=DataType.STRUCT
193185
```
194186

195187
```javascript
196-
// Node.js
188+
import { MilvusClient, DataType } from "@zilliz/milvus2-sdk-node";
189+
190+
const milvusClient = new MilvusClient("http://localhost:19530");
191+
192+
const schema = [
193+
{
194+
name: "id",
195+
data_type: DataType.INT64,
196+
is_primary_key: true,
197+
auto_id: true,
198+
},
199+
{
200+
name: "title",
201+
data_type: DataType.VARCHAR,
202+
max_length: 512,
203+
},
204+
{
205+
name: "author",
206+
data_type: DataType.VARCHAR,
207+
max_length: 512,
208+
},
209+
{
210+
name: "year_of_publication",
211+
data_type: DataType.INT64,
212+
},
213+
{
214+
name: "title_vector",
215+
data_type: DataType.FLOAT_VECTOR,
216+
dim: 5,
217+
},
218+
// highlight-start
219+
{
220+
name: "chunks",
221+
data_type: DataType.ARRAY,
222+
element_type: DataType.STRUCT,
223+
fields: [
224+
{
225+
name: "text",
226+
data_type: DataType.VARCHAR,
227+
max_length: 65535,
228+
},
229+
{
230+
name: "chapter",
231+
data_type: DataType.VARCHAR,
232+
max_length: 512,
233+
},
234+
{
235+
name: "text_vector",
236+
data_type: DataType.FLOAT_VECTOR,
237+
dim: 5,
238+
mmap_enabled: true,
239+
},
240+
],
241+
max_capacity: 1000,
242+
},
243+
// highlight-end
244+
];
197245
```
198246

199247
```bash
@@ -208,7 +256,7 @@ Indexing is mandatory for all vector fields, including both the vector fields in
208256

209257
The applicable index parameters vary depending on the index type in use. For details on applicable index parameters, refer to [Index Explained](index-explained.md) and the documentation pages specific to your selected index type.
210258

211-
To index an embedding list field, you need to set its index type to `HNSW`, and use `MAX_SIM_COSINE` as the metric type for Milvus to measure the similarities between embedding lists.
259+
To index an embedding list, you need to set its index type to `AUTOINDEX` or `HNSW`, and use `MAX_SIM_COSINE` as the metric type for Milvus to measure the similarities between embedding lists.
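
As a quick sanity check before building index parameters, the combinations described above can be validated client-side. The sketch below is a hypothetical helper, not part of any Milvus SDK; the allowed values are copied from the table in this document, and the server remains the source of truth.

```python
# Allowed values taken from the table in this document (assumption:
# the server may accept more or fewer combinations in other versions).
SUPPORTED_INDEX_TYPES = {"AUTOINDEX", "HNSW"}
SUPPORTED_METRIC_TYPES = {"MAX_SIM_COSINE", "MAX_SIM_IP", "MAX_SIM_L2"}


def check_embedding_list_index(index_type: str, metric_type: str) -> None:
    """Raise ValueError if the pair is not listed for embedding lists."""
    if index_type not in SUPPORTED_INDEX_TYPES:
        raise ValueError(
            f"index type {index_type!r} is not listed for embedding lists; "
            f"expected one of {sorted(SUPPORTED_INDEX_TYPES)}"
        )
    if metric_type not in SUPPORTED_METRIC_TYPES:
        raise ValueError(
            f"metric type {metric_type!r} is not listed for embedding lists; "
            f"expected one of {sorted(SUPPORTED_METRIC_TYPES)}"
        )


# Passes silently: this is the combination used in this guide.
check_embedding_list_index("AUTOINDEX", "MAX_SIM_COSINE")
```

Calling the helper with `"IVF_FLAT"` or with a plain `"L2"` metric raises a `ValueError`, which surfaces the mistake before a request is sent.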
212260

213261
<div class="multipleCode">
214262
<a href="#python">Python</a>
@@ -225,21 +273,16 @@ index_params = MilvusClient.prepare_index_params()
225273
# Create an index for the vector field in the collection
226274
index_params.add_index(
227275
field_name="title_vector",
228-
index_type="IVF_FLAT",
276+
index_type="AUTOINDEX",
229277
metric_type="L2",
230-
params={"nlist": 128}
231278
)
232279

233280
# highlight-start
234281
# Create an index for the vector field in the element Struct
235282
index_params.add_index(
236283
field_name="chunks[text_vector]",
237-
index_type="HNSW",
284+
index_type="AUTOINDEX",
238285
metric_type="MAX_SIM_COSINE",
239-
params={
240-
"M": 16,
241-
"efConstruction": 200
242-
}
243286
)
244287
# highlight-end
245288
```
@@ -253,7 +296,25 @@ index_params.add_index(
253296
```
254297

255298
```javascript
256-
// Node.js
299+
await milvusClient.createCollection({
300+
collection_name: "books",
301+
fields: schema,
302+
});
303+
304+
const indexParams = [
305+
{
306+
field_name: "title_vector",
307+
index_type: "AUTOINDEX",
308+
metric_type: "L2",
309+
},
310+
// highlight-start
311+
{
312+
field_name: "chunks[text_vector]",
313+
index_type: "AUTOINDEX",
314+
metric_type: "MAX_SIM_COSINE",
315+
},
316+
// highlight-end
317+
];
257318
```
258319

259320
```bash
@@ -273,6 +334,11 @@ Once the schema and index are ready, you can create a collection that includes a
273334
</div>
274335

275336
```python
337+
client = MilvusClient(
338+
uri="http://localhost:19530",
339+
token="root:Milvus"
340+
)
341+
276342
client.create_collection(
277343
collection_name="my_collection",
278344
schema=schema,
@@ -289,7 +355,11 @@ client.create_collection(
289355
```
290356

291357
```javascript
292-
// Node.js
358+
await milvusClient.createCollection({
359+
collection_name: "books",
360+
fields: schema,
361+
indexes: indexParams,
362+
});
293363
```
294364

295365
```bash
@@ -311,15 +381,14 @@ After creating the collection, you can insert data that includes Arrays of Struc
311381
```python
312382
# Sample data
313383
data = {
314-
'id': 0,
315384
'title': 'Walden',
316-
'title_vector': [0.1, 0.2, 0.3, 0.4, 0.5]
385+
'title_vector': [0.1, 0.2, 0.3, 0.4, 0.5],
317386
'author': 'Henry David Thoreau',
318-
'year-of-publication': 1845,
387+
'year_of_publication': 1845,
319388
'chunks': [
320389
{
321390
'text': 'When I wrote the following pages, or rather the bulk of them...',
322-
'text_vector': [0.3, 0.2, 0.3, 0.2, 0.5]
391+
'text_vector': [0.3, 0.2, 0.3, 0.2, 0.5],
323392
'chapter': 'Economy',
324393
},
325394
{
@@ -346,7 +415,31 @@ client.insert(
346415
```
347416

348417
```javascript
349-
// Node.js
418+
const data = [
{
419+
id: 0,
420+
title: "Walden",
421+
title_vector: [0.1, 0.2, 0.3, 0.4, 0.5],
422+
author: "Henry David Thoreau",
423+
year_of_publication: 1845,
424+
chunks: [
425+
{
426+
text: "When I wrote the following pages, or rather the bulk of them...",
427+
text_vector: [0.3, 0.2, 0.3, 0.2, 0.5],
428+
chapter: "Economy",
429+
},
430+
{
431+
text: "I would fain say something, not so much concerning the Chinese and...",
432+
text_vector: [0.7, 0.4, 0.2, 0.7, 0.8],
433+
chapter: "Economy",
434+
},
435+
],
436+
},
437+
];
438+
439+
await milvusClient.insert({
440+
collection_name: "books",
441+
data: data,
442+
});
350443
```
351444

352445
```bash
@@ -433,6 +526,9 @@ def generate_record(record_id: int) -> Dict[str, Any]:
433526

434527
# Generate 1000 records
435528
data = [generate_record(i) for i in range(1000)]
529+
530+
# Insert the generated data
531+
client.insert(collection_name="my_collection", data=data)
436532
```
437533

438534
</details>
@@ -441,13 +537,13 @@ data = [generate_record(i) for i in range(1000)]
441537

442538
You can perform vector searches on the vector fields of a collection and in an Array of Structs.
443539

444-
Specifically, you can directly use the names of the vector fields within Struct elements as the value for the `anns_field` parameter in a search request, and use `EmbeddingList` to organize query vectors neatly.
540+
Specifically, set the `anns_field` parameter of the search request to the name of the Array of Structs field followed by the name of the target vector field in square brackets, as in `chunks[text_vector]`, and use `EmbeddingList` to organize the query vectors.
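
The naming convention for `anns_field` can be captured in a small helper. This is only an illustrative sketch; `struct_vector_field` is a hypothetical name and not a function provided by pymilvus.

```python
def struct_vector_field(array_field: str, vector_field: str) -> str:
    """Build the field path for a vector field inside an Array of Structs."""
    if not array_field or not vector_field:
        raise ValueError("both field names must be non-empty")
    # Format used throughout this guide: <array field>[<vector field>]
    return f"{array_field}[{vector_field}]"


anns_field = struct_vector_field("chunks", "text_vector")
```

The same path format appears in this guide as the `field_name` of the index on the embedding list, so one helper can serve both the index definition and the search request.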
445541

446542
<div class="alert note">
447543

448-
Milvus provides `EmbeddingList` to help you organize query vectors for searches against an embedding list in an Array of Structs more neatly.
544+
Milvus provides `EmbeddingList` to help you organize the query vectors for a search against an embedding list in an Array of Structs. Each `EmbeddingList` contains at least one vector embedding, and the search returns up to topK entities for each `EmbeddingList`.
449545

450-
However, `EmbeddingList` can used only in `search()` requests without range search or grouping search parameters, let alone `search_iterator()` requests.
546+
However, `EmbeddingList` can be used only in `search()` requests that do not include range search or grouping search parameters. It cannot be used in `search_iterator()` requests.
451547

452548
</div>
453549

@@ -460,7 +556,7 @@ However, `EmbeddingList` can used only in `search()` requests without range sear
460556
</div>
461557

462558
```python
463-
from pymilvus import EmbeddingList
559+
from pymilvus.client.embedding_list import EmbeddingList
464560

465561
# each query embedding list triggers a single search
466562
embeddingList1 = EmbeddingList()
@@ -490,7 +586,20 @@ results = client.search(
490586
```
491587

492588
```javascript
493-
// Node.js
589+
const embeddingList1 = [[0.2, 0.9, 0.4, -0.3, 0.2]];
590+
const embeddingList2 = [
591+
[-0.2, -0.2, 0.5, 0.6, 0.9],
592+
[-0.4, 0.3, 0.5, 0.8, 0.2],
593+
];
594+
const results = await milvusClient.search({
595+
collection_name: "books",
596+
data: embeddingList1,
597+
anns_field: "chunks[text_vector]",
598+
search_params: { metric_type: "MAX_SIM_COSINE" },
599+
limit: 3,
600+
output_fields: ["chunks[text]"],
601+
});
602+
494603
```
495604

496605
```bash
@@ -583,7 +692,14 @@ print(results)
583692
```
584693

585694
```javascript
586-
// Node.js
695+
const results2 = await milvusClient.search({
696+
collection_name: "books",
697+
data: [embeddingList1, embeddingList2],
698+
anns_field: "chunks[text_vector]",
699+
search_params: { metric_type: "MAX_SIM_COSINE" },
700+
limit: 3,
701+
output_fields: ["chunks[text]"],
702+
});
587703
```
588704

589705
```bash
