site/en/userGuide/schema/json-shredding.md
Lines changed: 12 additions & 14 deletions
@@ -88,13 +88,13 @@ The final phase leverages the optimized storage layout to intelligently select t

 ## Enable JSON shredding

-To activate the feature, set `common.enabledJSONShredding` to `true` in your `milvus.yaml` configuration file. New data will automatically trigger the shredding process.
+To activate the feature, set `common.enabledJSONKeyStats` to `true` in your `milvus.yaml` configuration file. New data will automatically trigger the shredding process.

 ```yaml
 # milvus.yaml
 ...
 common:
-  enabledJSONShredding: true # Indicates whether to enable JSON key stats build and load processes
+  enabledJSONKeyStats: true # Indicates whether to enable JSON key stats build and load processes
 ...
 ```

@@ -112,34 +112,34 @@ For most users, once JSON shredding is enabled, the default settings for other p
 <td><p>Determines whether Milvus uses mmap when loading shredding data.</p><p>For details, refer to <a href="mmap.md">Use mmap</a>.</p></td>
 <td><p>true</p></td>
 <td><p>This setting is generally optimized for performance. Only adjust it if you have specific memory management needs or constraints on your system.</p></td>
 <td><p>The maximum number of JSON keys that will be stored in shredded columns. </p><p>If the number of frequently appearing keys exceeds this limit, Milvus will prioritize the most frequent ones for shredding, and the remaining keys will be stored in the shared column.</p></td>
 <td><p>1024</p></td>
 <td><p>This is sufficient for most scenarios. For JSON with thousands of frequently appearing keys, you may need to increase this, but monitor storage usage.</p></td>
 <td><p>The minimum occurrence ratio a JSON key must have to be considered for shredding into a shredded column.</p><p>A key is considered frequently appearing if its ratio is above this threshold.</p></td>
 <td><p>0.3</p></td>
-<td><p><strong>Increase</strong> (e.g., to 0.5) if the number of keys that meet the shredding criteria exceeds the <code>dataCoord.jsonShreddingMaxColumns</code> limit. This makes the threshold stricter, reducing the number of keys that qualify for shredding.</p><p><strong>Decrease</strong> (e.g., to 0.1) if you want to shred more keys that appear less frequently than the default 30% threshold.</p></td>
+<td><p><strong>Increase</strong> (e.g., to 0.5) if the number of keys that meet the shredding criteria exceeds the <code>dataCoord.jsonStatsMaxShreddingColumns</code> limit. This makes the threshold stricter, reducing the number of keys that qualify for shredding.</p><p><strong>Decrease</strong> (e.g., to 0.1) if you want to shred more keys that appear less frequently than the default 30% threshold.</p></td>
 </tr>
 </table>

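Reviewer note: the table rows in this hunk describe how the occurrence-ratio threshold (default 0.3) and the column limit (default 1024, `dataCoord.jsonStatsMaxShreddingColumns`) interact. The Python sketch below is only an illustration of that description, not Milvus's implementation. The function name `select_shredded_keys` and the parameters `min_ratio` and `max_columns` are hypothetical, only top-level keys are counted, and the strict "above the threshold" comparison is an assumption drawn from the table wording.

```python
from collections import Counter


def select_shredded_keys(docs, min_ratio=0.3, max_columns=1024):
    """Split top-level JSON keys into (shredded, shared) lists.

    Illustrative only: min_ratio and max_columns are hypothetical names that
    mirror the defaults quoted in the table (0.3 and 1024).
    """
    docs = list(docs)
    if not docs:
        return [], []

    # Count each top-level key once per document in which it appears.
    counts = Counter()
    for doc in docs:
        counts.update(doc.keys())

    total = len(docs)
    # Assumption: a key qualifies when its ratio is strictly above the threshold.
    frequent = [key for key, n in counts.items() if n / total > min_ratio]
    # Most frequent keys first; ties are broken by name so output is deterministic.
    frequent.sort(key=lambda key: (-counts[key], key))

    # Keys beyond the column limit fall back to the shared column.
    shredded = frequent[:max_columns]
    shredded_set = set(shredded)
    shared = sorted(key for key in counts if key not in shredded_set)
    return shredded, shared


docs = [
    {"a": 1, "b": 2},
    {"a": 1, "c": 3},
    {"a": 1, "b": 2, "d": 4},
    {"a": 1},
]
shredded, shared = select_shredded_keys(docs)
print("shredded:", shredded)
print("shared:", shared)
```

In this toy data set, raising `min_ratio` or lowering `max_columns` moves keys from the shredded list to the shared list, which matches the tuning advice in the changed row.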
@@ -231,15 +231,13 @@ This test focused on querying sparse, nested keys that fall into the "shared" ca

 1. Next, verify that the data has been loaded by running `show loaded-json-stats` on the query node. The output will display details about the loaded shredded data for each query node.

-- **What if I encounter an error?**
-
-If the build or load process fails, you can quickly disable the feature by setting `common.enabledJSONShredding=false`. To clear any remaining tasks, use the `remove stats-task <task_id>` command in [Birdwatcher](birdwatcher_usage_guides.md). If a query fails, set `common.usingjsonShreddingForQuery=false` to revert to the original query path, bypassing the shredded data.
-
 - **How do I select between JSON shredding and JSON indexing?**

 - **JSON shredding** is ideal for keys that appear frequently in your documents, especially for complex JSON structures. It combines the benefits of columnar storage and inverted indexing, making it well-suited for read-heavy scenarios where you query many different keys. However, it is not recommended for very small JSON documents as the performance gain is minimal. The smaller the proportion of the key's value to the total size of the JSON document, the better the performance optimization from shredding.

 - **JSON indexing** is better for targeted optimization of specific key-based queries and has lower storage overhead. It's suitable for simpler JSON structures. Note that JSON shredding does not cover queries on keys inside arrays, so you need a JSON index to accelerate those.

-For details, refer to [JSON Field Overview](json-field-overview.md#Next-Accelerate-JSON-queries).
+- **What if I encounter an error?**
+
+If the build or load process fails, you can quickly disable the feature by setting `common.enabledJSONKeyStats=false`. To clear any remaining tasks, use the `remove stats-task <task_id>` command in [Birdwatcher](birdwatcher_usage_guides.md). If a query fails, set `common.usingJsonStatsForQuery=false` to revert to the original query path, bypassing the shredded data.
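Reviewer note: this change renames three configuration keys across the page. A small script along the following lines could help check other docs or local `milvus.yaml` copies for the old names. It is a minimal sketch: the mapping is taken only from the changed lines in this diff, the helper names `find_stale_names` and `apply_renames` are hypothetical, and it does plain text matching rather than parsing YAML.

```python
import re

# Old key name -> new key name, as they appear in the changed lines of this diff.
RENAMES = {
    "enabledJSONShredding": "enabledJSONKeyStats",
    "jsonShreddingMaxColumns": "jsonStatsMaxShreddingColumns",
    "usingjsonShreddingForQuery": "usingJsonStatsForQuery",
}

# Match any old name as a whole word, so prefixes such as "common." still match.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def find_stale_names(text):
    """Return (line_number, old_name, new_name) for every old name found."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            old = match.group(1)
            hits.append((lineno, old, RENAMES[old]))
    return hits


def apply_renames(text):
    """Return text with every old key name replaced by its new name."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], text)


sample = "common:\n  enabledJSONShredding: true\n"
for lineno, old, new in find_stale_names(sample):
    print(f"line {lineno}: {old} -> {new}")
```

Because the new names do not contain any of the old names, running `apply_renames` a second time leaves the text unchanged.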