Commit 03a20523, authored 4 years ago by XuanYang-cn, committed by zhenshan.cao 4 years ago
Update DataNode recovery design (#5578)

Signed-off-by: yangxuan <xuan.yang@zilliz.com>

parent aa8a0383
docs/design_docs/datanode_recovery_design_0513_2021.md (30 additions, 49 deletions)
# DataNode Recovery Design
update: 5.21.2021, by [Goose](https://github.com/XuanYang-cn)

update: 6.03.2021, by [Goose](https://github.com/XuanYang-cn)
## Objectives
## What's DataNode?
DataNode processes insert data and persists it.
DataNode is stateless. It does whatever DataService tells it to do, so recovery is not difficult for DataNode.
Once DataNode subscribes to certain vchannels, it keeps working until it crashes. So the key to recovery is consuming
vchannels from the right position. What has been processed does not need to be processed again; what has not been
processed is what matters.
DataNode is based on flowgraphs; each flowgraph cares about only one vchannel. A vchannel is a FIFO log stream that
carries ddl messages, dml messages, and timetick messages.
What's the line between processed and not processed for DataNode? Whether the data has been flushed into persistent
storage, which is the only job of DataNode. So recovering a DataNode needs the last position of flushed data in every vchannel.
Luckily, this information is provided by DataService; DataNode only needs to worry about updating positions after flushing.
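
The idea of resuming a vchannel at the last flushed position can be sketched as follows. This is a minimal illustration in Python, not DataNode's actual code: `Msg`, `resume`, and the use of a plain integer as a position are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class Msg:
    position: int                      # hypothetical: offset of the message in the vchannel
    kind: str                          # "ddl", "dml", or "timetick"
    segment_id: Optional[int] = None   # only set for dml messages


def resume(vchannel: Iterable[Msg], last_flushed_position: int) -> Iterator[Msg]:
    """Yield only the messages that come after the last flushed position."""
    for msg in vchannel:
        if msg.position > last_flushed_position:
            yield msg


# A FIFO log in which dml messages of segments 10 and 11 are interleaved.
log = [
    Msg(1, "dml", 10),
    Msg(2, "timetick"),
    Msg(3, "dml", 11),
    Msg(4, "dml", 10),
]

# Everything up to position 2 was flushed before the crash, so only 3 and 4 are replayed.
replayed = list(resume(log, last_flushed_position=2))
```

Because the position is tracked per vchannel rather than per segment, the sketch does not need to know which segment a skipped message belonged to.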
One vchannel only contains dml messages of one collection. A collection consists of many segments, hence
a vchannel contains dml messages of many segments. **Most importantly, the dml messages of the same segment
can appear anywhere in the vchannel.**
There's more to fully recovering a DataNode. DataNode replicates the collection schema in memory to decode and encode
data. Once it recovers to an older position of the insert channels, it needs the collection schema snapshots from
exactly that position. Luckily again, the snapshots are provided by MasterService.
## What does DataNode recovery really mean?
So DataNode needs to achieve the following 3 objectives.
DataNode is stateless, but a vchannel has state. DataNode's statelessness is guaranteed by DataService, which
means the vchannel's state is maintained by DataService. So DataNode recovery is no different from starting.
### 1. Service Registration
So what's DataNode's starting procedure?
## Objectives
### 1. Service Registration
DataNode registers itself to Etcd after the grpc server has started, in *INITIALIZING* state.
### 2. Service Discovery
DataNode discovers DataService and MasterService, in *HEALTHY* and *IDLE* state.
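
The first two stages can be pictured as a small state transition. The sketch below is only an illustration under assumptions: the class `DataNodeSketch`, its methods, and the boolean arguments are hypothetical, and only the state names *INITIALIZING*, *HEALTHY*, and *IDLE* come from the text.

```python
from enum import Enum


class State(Enum):
    INITIALIZING = "INITIALIZING"
    HEALTHY = "HEALTHY"


class DataNodeSketch:
    """Hypothetical model of the startup stages; not the real DataNode."""

    def __init__(self) -> None:
        self.state = None
        self.idle = True  # no vchannel is being consumed yet

    def register(self) -> None:
        # Stage 1: called once the grpc server has started.
        self.state = State.INITIALIZING

    def discover(self, data_service_found: bool, master_service_found: bool) -> None:
        # Stage 2: the node becomes healthy only if both services are found.
        if self.state is not State.INITIALIZING:
            raise RuntimeError("register() must be called before discover()")
        if data_service_found and master_service_found:
            self.state = State.HEALTHY


node = DataNodeSketch()
node.register()
state_after_registration = node.state
node.discover(data_service_found=True, master_service_found=True)
```

After both stages the node is healthy but still idle, since it has not yet been given any vchannel to consume.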
### 3. Recovery state
### 3. Flowgraph Recovery
After stage 1&2, DataNode is healthy but IDLE. DataNode starts to work until the following happens.

The detailed design can be found at [datanode flowgraph recovery design](datanode_flowgraph_recovery_design_0604_2021.md).
- DataService informs DataNode of the vchannels and positions.
After DataNode subscribes to a stateful vchannel, DataNode starts to work, or more specifically, flowgraph starts to work.
- DataNode replicates the snapshots of the collection schema at the positions of the collections to which these vchannels belong.
A vchannel is stateful because we don't want to process twice what's already processed. A "processed" message is one that is
already persistent. In DataNode's terminology, a message is processed if it has been flushed.
- DataNode initializes flowgraphs and subscribes to these vchannels.
DataService tells DataNode the stateful vchannel infos through RPC `WatchDmChannels`, so that DataNode won't process
the same messages over and over again. So a flowgraph needs the ability to consume messages starting from the middle of a vchannel.
There're some problems I haven't thought of.
DataNode tells DataService the vchannel states after each flush through RPC `SaveBinlogPaths`, so that DataService
keeps the vchannel states up to date.
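
The round trip between the two RPCs can be sketched as below. This is a simplified model under assumptions, not the real interface: only the RPC names `WatchDmChannels` and `SaveBinlogPaths` come from the text, while the classes, method signatures, integer positions, and binlog path format are hypothetical.

```python
class DataServiceSketch:
    """Hypothetical stand-in that owns the vchannel states."""

    def __init__(self) -> None:
        self.positions = {}  # vchannel name -> last flushed position

    def watch_info(self, vchannel: str) -> dict:
        # Stand-in for the payload DataService sends with WatchDmChannels.
        return {"vchannel": vchannel, "position": self.positions.get(vchannel, 0)}

    def save_binlog_paths(self, vchannel: str, paths: list, position: int) -> None:
        # Stand-in for SaveBinlogPaths: remember how far the data has been flushed.
        self.positions[vchannel] = position


class NodeSketch:
    """Hypothetical stateless node: everything it knows comes from DataService."""

    def __init__(self, data_service: DataServiceSketch) -> None:
        self.data_service = data_service
        self.start = {}  # vchannel name -> position to start consuming after

    def watch_dm_channels(self, info: dict) -> None:
        self.start[info["vchannel"]] = info["position"]

    def flush(self, vchannel: str, up_to_position: int) -> None:
        paths = [f"binlog/{vchannel}/{up_to_position}"]  # made-up path format
        self.data_service.save_binlog_paths(vchannel, paths, up_to_position)


service = DataServiceSketch()

first = NodeSketch(service)
first.watch_dm_channels(service.watch_info("vch-0"))
first.flush("vch-0", up_to_position=7)
# The first node crashes here; nothing on the node itself needs to survive.

second = NodeSketch(service)
second.watch_dm_channels(service.watch_info("vch-0"))
```

In this sketch the replacement node starts after position 7 without any node-local state, which is the sense in which recovery is no different from starting.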
- What if DataService is unavailable, due to network failure, DataService crashing, etc.?
- What if MasterService is unavailable, due to network failure, MasterService crashing, etc.?
- What if MinIO is unavailable, due to network failure?
## TODO
## Some of the following interface/proto designs are outdated and will be updated soon
### 1. DataNode no longer interacts with Etcd except service registering
...
...
@@ -51,15 +55,6 @@ There're some problems I haven't thought of.
![datanode_design](graphs/datanode_design_01.jpg)
##### Auto-flush with manual-flush
Manual-flush means that the segment is sealed and DataNode is told to flush by DataService. A manual-flush is complete
only when both ddl and insert data are flushed, and a flush-completed message is then published to
msgstream by DataService. In this case, not only the binlog paths but also the msg-positions need to be stored.
Auto-flush means that the segment isn't sealed, but the buffer of insert/ddl data in DataNode is full, so
DataNode automatically flushes the data. The paths of those flushed binlogs are buffered in DataNode, waiting for the next
manual-flush to be uploaded to DataService together.
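
The interplay of the two flush kinds can be sketched as follows. This is a minimal illustration under assumptions: `SegmentBuffer`, its row-count capacity, the binlog path names, and the integer position are all hypothetical and only mirror the behaviour described above.

```python
class SegmentBuffer:
    """Hypothetical per-segment buffer showing auto-flush and manual-flush."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.rows = []
        self.pending_paths = []  # binlog paths not yet reported to DataService
        self.flush_count = 0

    def _write_binlog(self) -> str:
        # Pretend to persist the buffered rows and return the binlog path.
        self.flush_count += 1
        self.rows.clear()
        return f"binlog-{self.flush_count}"

    def insert(self, row) -> None:
        self.rows.append(row)
        if len(self.rows) >= self.capacity:
            # Auto-flush: the buffer is full, keep the path until the next manual-flush.
            self.pending_paths.append(self._write_binlog())

    def manual_flush(self, position: int) -> dict:
        # Manual-flush: flush what is left and report all paths with the position.
        if self.rows:
            self.pending_paths.append(self._write_binlog())
        report = {"paths": self.pending_paths, "position": position}
        self.pending_paths = []
        return report


buf = SegmentBuffer(capacity=2)
for row in ["a", "b", "c"]:
    buf.insert(row)
paths_after_auto_flush = list(buf.pending_paths)
report = buf.manual_flush(position=3)
```

Here the third row stays in memory until the manual-flush, at which point both binlog paths are reported in a single message together with the position.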
##### DataService RPC Design
...
...
@@ -100,20 +95,6 @@ message DDLBinlogMeta {
}
```
#### **O1-2** DataNode registers itself to Etcd when started
### 2. DataNode gets start and end MsgPositions of all channels, and reports to DataService after flushing

**O2-1**. Set start and end positions while publishing ddl messages. 0.5 Day

**O2-2**. [after **O4-1**] Get message positions in flowgraph and pass them through nodes, report to DataService along with binlog paths. 1 Day

**O2-3**. [with **O1-1**] DataNode is no longer aware of whether a segment is flushed, so SegmentFlushed messages should be sent by DataService. 1 Day
### 3. DataNode recovery
**O3-1**. Flowgraph is initialized after DataService calls WatchDmChannels; the flowgraph is healthy if MasterService is available. 2 Days
### 4. DataNode with collection with flowgraph with vchannel designs
#### The winner
...
...