## 1. System Overview
In this section, we sketch the system design of Milvus, including data model, data organization, architecture, and state synchronization.
Milvus exposes the following set of data features to applications:
* a query language that covers data definition, data manipulation, and data query. Data definition includes create and drop; data manipulation includes insert, upsert, and delete; and data query falls into three types: primary-key search, approximate nearest neighbor search (ANNS), and ANNS with predicates.
The execution order of requests strictly follows their issue-time order. We take the proxy's issue time as a request's issue time; for a batch request, all its sub-requests share the same issue time. When there are multiple proxies, issue times from different proxies are regarded as coming from a central clock.
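To make the issue-time rule concrete, here is a minimal Go sketch, assuming a hypothetical centralized timestamp allocator (none of these names come from the Milvus codebase): every request gets a strictly increasing issue time, and all sub-requests of a batch share one timestamp.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// centralClock stands in for a centralized timestamp oracle;
// the name and implementation are illustrative, not Milvus API.
type centralClock struct {
	mu   sync.Mutex
	last uint64
}

// Allocate returns a strictly increasing timestamp, so issue times
// from different proxies form a single total order.
func (c *centralClock) Allocate() uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	ts := uint64(time.Now().UnixNano())
	if ts <= c.last {
		ts = c.last + 1
	}
	c.last = ts
	return ts
}

type request struct {
	op        string
	issueTime uint64
}

func main() {
	clock := &centralClock{}

	// A batch insert: every sub-request shares the same issue time,
	// which is what lets the whole batch become visible atomically.
	batchTs := clock.Allocate()
	batch := []request{
		{op: "insert row 1", issueTime: batchTs},
		{op: "insert row 2", issueTime: batchTs},
	}

	// A later request gets a strictly larger issue time and is
	// therefore ordered after the entire batch.
	later := request{op: "search", issueTime: clock.Allocate()}

	fmt.Println(batch, later)
}
```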
Transactions are currently not supported by Milvus.
A batch insert/delete is guaranteed to become visible atomically.
......@@ -28,11 +28,11 @@ A batch insert/delete is guaranteed to become visible atomically.
<img src="./figs/data_organization.png" width=550>
In Milvus, 'collection' refers to the concept of a table. A collection can optionally be divided into several 'partitions'. Both collections and partitions are basic execution scopes of queries. When using partitions, users should clearly know how a collection should be partitioned. In most cases, partitioning leads to more flexible data management and more efficient querying. For a partitioned collection, queries can be executed either on the whole collection or on a set of specified partitions.
Each collection or partition contains a set of 'segment groups'. A segment group is the basic unit of data-to-node mapping. It is also the basic unit of replication. For instance, if a query node fails, its segment groups will be redistributed across other nodes. If a query node is overloaded, part of its segment groups will be migrated to underloaded nodes. If a hot collection/partition is detected, its segment groups will be replicated to smooth out the system load skew.
'Segment' is the finest unit of data organization. It is where the data and indexes are actually kept. Each segment contains a set of rows. In order to reduce the memory footprint during query execution and to fully utilize SIMD, the physical data layout within segments is organized in a column-based manner.
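The hierarchy described above can be summarized with a few illustrative Go types (the type names are ours, not Milvus source code); note how a segment keeps one slice per field to realize the column-based layout:

```go
package main

// Collection -> Partition -> SegmentGroup -> Segment, as described above.
// These types are a sketch for illustration, not the Milvus data model code.

type Collection struct {
	Name       string
	Partitions []Partition
}

type Partition struct {
	Name          string
	SegmentGroups []SegmentGroup // basic unit of data-to-node mapping and replication
}

type SegmentGroup struct {
	ID       int64
	Segments []Segment
}

// Segment stores rows in a column-based physical layout: one contiguous
// slice per field, which keeps the memory footprint small during query
// execution and is friendly to SIMD scans.
type Segment struct {
	ID      int64
	RowIDs  []int64
	Vectors [][]float32        // one vector column
	Scalars map[string][]int64 // scalar columns, keyed by field name
}

func main() {}
```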
The query/write nodes are linked to the hash ring, with each node covering some portion of it.
The query nodes hold all the indexes in memory. Since building an index is time-consuming, the query nodes dump their indexes to disk (the storage engine) for fast failure recovery and cross-node index copying.
The write nodes are stateless. They simply transform the newly arrived WALs to the binlog format, then append the binlog to the storage engine.
Note that not all components are necessarily replicated. The system provides failure tolerance by maintaining multiple copies of the WAL and the binlog. When there is no in-memory index replica and a query node fails, other query nodes will take over its indexes by loading the dumped index files or rebuilding them from the binlog and WALs. The links from query nodes to the hash ring will also be adjusted so that the failed node's input WAL stream can be properly handled by its neighbors.
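As a rough illustration of the data-to-node mapping, the following is a toy consistent-hash ring in Go, written under our own assumptions rather than taken from Milvus: segment groups hash onto the ring, each node owns the arc ending at its position, and removing a node lets its clockwise neighbor take over that arc, which mirrors the failure handling described above.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// ring is a toy consistent-hash ring; the real Milvus placement logic
// is more involved, this only illustrates the idea in the text.
type ring struct {
	points []uint32          // sorted node positions on the ring
	nodes  map[uint32]string // position -> node name
}

func hash32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

func (r *ring) addNode(name string) {
	if r.nodes == nil {
		r.nodes = map[uint32]string{}
	}
	p := hash32(name)
	r.points = append(r.points, p)
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	r.nodes[p] = name
}

// owner finds the first node clockwise from the key's position, so when
// a node disappears its keys are naturally re-covered by its neighbor.
func (r *ring) owner(key string) string {
	k := hash32(key)
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= k })
	if i == len(r.points) {
		i = 0 // wrap around the ring
	}
	return r.nodes[r.points[i]]
}

func main() {
	r := &ring{}
	r.addNode("querynode-1")
	r.addNode("querynode-2")
	fmt.Println(r.owner("segment-group-42"))
}
```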
#### 1.5 Stream and Time
In order to boost throughput, we model Milvus as a stream-driven system.
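One way to picture the stream-driven model, sketched with assumed message shapes (not the actual Milvus message format): each stream interleaves data events with periodic time ticks, and a consumer only applies events once a tick guarantees that nothing earlier is still in flight.

```go
package main

import (
	"fmt"
	"sort"
)

// event is a toy stream message: either a data operation or a time tick.
// This models the idea only; it is not the Milvus message definition.
type event struct {
	ts     uint64
	isTick bool
	op     string
}

func main() {
	stream := []event{
		{ts: 3, op: "insert b"}, // events may arrive out of timestamp order
		{ts: 1, op: "insert a"},
		{ts: 4, isTick: true}, // tick: no later event will carry ts <= 4
		{ts: 6, op: "delete a"},
		{ts: 8, isTick: true},
	}

	var pending []event
	for _, e := range stream {
		if !e.isTick {
			pending = append(pending, e)
			continue
		}
		// On a tick, everything at or before the tick's timestamp is
		// complete, so it can safely be applied in timestamp order.
		sort.Slice(pending, func(i, j int) bool { return pending[i].ts < pending[j].ts })
		n := 0
		for _, p := range pending {
			if p.ts <= e.ts {
				fmt.Printf("apply %q at ts=%d\n", p.op, p.ts)
			} else {
				pending[n] = p
				n++
			}
		}
		pending = pending[:n]
	}
}
```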
```go
type Component interface {
	GetComponentStates(ctx context.Context) (*internalpb2.ComponentStates, error)
	GetTimeTickChannel(ctx context.Context) (*milvuspb.StringResponse, error)
	GetStatisticsChannel(ctx context.Context) (*milvuspb.StringResponse, error)
}
```
* *GetComponentStates*
```go
type ComponentInfo struct {
	// ... (fields elided in this excerpt)
}

type ComponentStates struct {
	State              *ComponentInfo
	SubcomponentStates []*ComponentInfo
	Status             *commonpb.Status
}
```
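Assuming simplified local stand-ins for the proto-generated types above (the field set here is illustrative, not the real message definition), a caller might aggregate component health like this:

```go
package main

import "fmt"

// Simplified stand-ins for the proto-generated types, just enough
// to show how a health check might walk a ComponentStates tree.
type ComponentInfo struct {
	Role      string
	StateCode int // e.g. 0 = initializing, 1 = healthy, 2 = abnormal (assumed encoding)
}

type ComponentStates struct {
	State              *ComponentInfo
	SubcomponentStates []*ComponentInfo
}

const healthy = 1

// allHealthy reports whether the component and all of its
// subcomponents are in the healthy state.
func allHealthy(s *ComponentStates) bool {
	if s.State == nil || s.State.StateCode != healthy {
		return false
	}
	for _, sub := range s.SubcomponentStates {
		if sub.StateCode != healthy {
			return false
		}
	}
	return true
}

func main() {
	states := &ComponentStates{
		State: &ComponentInfo{Role: "QueryNode", StateCode: healthy},
		SubcomponentStates: []*ComponentInfo{
			{Role: "FlowGraph", StateCode: healthy},
		},
	}
	fmt.Println("healthy:", allHealthy(states))
}
```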