14 <<<samba-chilli-flavour-crop-bright-1280.jpg,height=.8\textheight>>>
16 <<<samba-chilli-flavour-crop-bright-1280.jpg,height=.8\textheight>>>
18 <<<samba-chilli-flavour-crop-bright-1280.jpg,height=.8\textheight>>>
26 <<<samba-kisses-better-selection.jpg,height=.8\textheight>>>
30 ==== Short History ====
33 * 2.0: 1999/01: domain-member, +SWAT
34 * 2.2: 2001/04: NT4-DC
35 * 3.0: 2003/09: AD-member, Samba4 project started
36 * 3.2: 2008/07: GPLv3, experimental clustering
37 * 3.3: 2009/01: clustering
38 * 3.4: 2009/07: merged S3+S4 code
39 * 3.5: 2010/03: experimental SMB 2.0
40 * 3.6: 2011/09: SMB 2.0
41 * 4.0: 2012/12: AD/DC, SMB 2.0 durable handles, 2.1, 3.0
42 * 4.1: 2013/10: stability
43 * 4.2: soon: AD trusts, performance, scalability, CTDB included
45 ==== Release Stream ====
49 <<<samba-release-stream_exp.png,width=.8\textwidth>>>
57 <<<samba-chilli-flavour-crop-bright-1280.jpg,height=.8\textheight>>>
60 ==== Samba File Serving Topics ====
62 * SMB features (SMB3...)
65 * Interop (Protocols, NFS, AFP, ...)
66 * special file systems support
69 %%==== Other Samba Topics ====
71 %%* Auth/Domain Member
76 ==== File Server Habitat ====
78 * scalable file server
79 ** scale-out: powerful clusters
80 ** scale-down: low-end boxes
84 * (samba $\leftrightarrow$ cifs.ko alternative to nfs?...)
90 ** low profile platforms (arm, ...)
92 * database performance
96 ** concrete db design (notify, open files, ...)
97 * scaling out (1 system)
99 *** async I/O with helper threads
100 * cluster performance
105 * multi-protocol access
106 ** nfs (kernel, ganesha, ...)
114 ** SMB2+ unix-extensions
122 ==== File Server Layout/Scope ====
125 <<<samba-layers.jpg,height=.8\textheight>>>
128 ==== SMB Features ====
131 ** durable file handles [4.0]
133 ** multi-credit / large mtu [4.0]
134 ** dynamic reauthentication [4.0]
136 ** resilient file handles [ever?]
138 ** new crypto (sign/encrypt) [4.0]
139 ** secure negotiation [4.0]
140 ** durable handles v2 [4.0]
141 ** persistent file handles [planning]
142 ** multi-channel [WIP+]
143 ** SMB direct [designed/starting]
144 ** cluster features [designing]
146 ** storage features [WIP]
151 ==== Clustered Samba / CTDB (SOFS since 2007) ====
155 <<<design-ctdb-three-nodes.png,width=.9\textwidth>>>
165 %%% * new crypto (signing, transport encryption)
166 %%% * persistent file handles
168 %%% * RDMA transport (SMB direct)
169 %%% * storage features
172 %%% ** transparent failover (continuous availability)
173 %%% ** all-active (scale-out)
176 %%% ==== SMB3 - Goals ====
181 %%% * fault tolerance / reliability
182 %%% * performance / throughput / scaling
183 %%% * focus on support for server workloads \\ %
184 %%% (as opposed to workstation workloads)
185 %%% * especially support for:
189 %%% ** replace block storage in data center
190 %%% ** block (SCSI) over SMB
193 %%% ==== Requirements for Hyper-V ====
198 %%% * minimum requirements:
200 %%% ** is that really all??? - maybe resilient file handles..
203 %%% * desired features:
204 %%% ** cluster ($\ge 2$ nodes)
205 %%% ** CA / persistent handles
206 %%% ** RDMA / SMB direct
210 %%% ==== SMB Protocol in Samba ====
218 %%% ** experimental incomplete support for SMB 2.0
220 %%% ** official support for SMB 2.0
221 %%% ** missing: durable handles
222 %%% ** default server max proto: SMB 1
224 %%% ** SMB 2.0: complete with durable handles
225 %%% ** SMB 2.1: basis, multi-credit, dynamic reauthentication
226 %%% ** SMB 3.0: basis, crypto, secure negotiation, durable v2
227 %%% ** default server max proto: SMB 3.0
229 %%% ** SMB 3.02: basic
243 %%% <<<samba-chilli-flavour-crop-bright-1280.jpg,height=.8\textheight>>>
248 ==== Multi-Channel - Windows/Protocol ====
251 * find interfaces with interface discovery: \\ %
252 @FSCTL\_QUERY\_NETWORK\_INTERFACE\_INFO@
253 * bind additional TCP (or RDMA) connection (channel) to established SMB3 session (session bind)
254 * bind (TCP) connections of same quality
255 * bind only to a single node
256 * replay / retry mechanisms, epoch numbers
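The "same quality" rule above can be sketched as a small selection step over the interface list returned by interface discovery. The capability bit values follow the interface-info response as described in MS-SMB2; the ranking (RDMA first, then RSS, then link speed) and the names @Interface@, @quality@ and @pick_channels@ are assumptions made for illustration, not the exact Windows client policy.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Capability bits of a network interface info entry (per MS-SMB2, to be
# double-checked against the spec before relying on them).
RSS_CAPABLE = 0x1
RDMA_CAPABLE = 0x2


@dataclass(frozen=True)
class Interface:
    """Hypothetical, simplified view of one discovered server interface."""
    address: str
    capability: int
    link_speed: int  # bits per second


def quality(iface: Interface) -> Tuple[bool, bool, int]:
    """Assumed ranking key: RDMA beats RSS beats plain, then link speed."""
    return (
        bool(iface.capability & RDMA_CAPABLE),
        bool(iface.capability & RSS_CAPABLE),
        iface.link_speed,
    )


def pick_channels(interfaces: List[Interface]) -> List[Interface]:
    """Return only the interfaces that share the best quality class."""
    if not interfaces:
        return []
    best = max(quality(i) for i in interfaces)
    return [i for i in interfaces if quality(i) == best]


if __name__ == "__main__":
    discovered = [
        Interface("10.0.0.1", RSS_CAPABLE, 10_000_000_000),
        Interface("10.0.1.1", RSS_CAPABLE, 10_000_000_000),
        Interface("192.168.0.1", 0, 1_000_000_000),
    ]
    for iface in pick_channels(discovered):
        print(iface.address)
```

With the sample list, only the two 10 Gbit/s RSS-capable interfaces are kept; the 1 Gbit/s interface is not mixed into the same session.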
259 ==== Multi-Channel - Samba ====
262 * samba/smbd: multi-process
263 ** process $\Leftrightarrow$ tcp connection
264 ** ==> transfer new connection to existing smbd
265 ** use fd-passing (sendmsg/recvmsg)
268 * preparation: messaging rewrite using unix dgm sockets with sendmsg [DONE,Volker]
269 * add fd-passing [WIP]
270 * transfer connection already in negprot (ClientGUID) [TODO]
271 * implement channel epoch numbers [started]
272 * implement interface discovery [TODO]
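The fd-passing step above can be illustrated with a minimal sketch: one process hands an already accepted connection to another over a unix datagram socket, using sendmsg/recvmsg with SCM_RIGHTS ancillary data. This is not Samba's messaging code; the helper names @send_fd@, @recv_fd@ and @demo@ are hypothetical, and both "processes" are simulated with socket pairs inside one process to keep the example self-contained.

```python
import array
import socket
from typing import Tuple


def send_fd(channel: socket.socket, fd: int, note: bytes) -> None:
    """Send one open descriptor plus a short note as SCM_RIGHTS ancillary data."""
    fds = array.array("i", [fd])
    channel.sendmsg([note], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(channel: socket.socket, bufsize: int = 64) -> Tuple[bytes, int]:
    """Receive the note and the passed descriptor (a new fd number in the receiver)."""
    fds = array.array("i")
    data, ancdata, _flags, _addr = channel.recvmsg(
        bufsize, socket.CMSG_LEN(fds.itemsize)
    )
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            usable = len(cdata) - (len(cdata) % fds.itemsize)
            fds.frombytes(cdata[:usable])
            return data, fds[0]
    raise RuntimeError("no file descriptor received")


def demo() -> bytes:
    # Stand-in for the client's new TCP connection and the server end of it.
    client, accepted = socket.socketpair()
    # Stand-in for the unix datagram messaging channel between two smbd processes.
    new_smbd, existing_smbd = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        send_fd(new_smbd, accepted.fileno(), b"ClientGUID matches")
        accepted.close()  # the sender drops its copy; the passed fd stays valid
        _note, passed = recv_fd(existing_smbd)
        adopted = socket.socket(fileno=passed)  # receiver now owns the connection
        try:
            client.sendall(b"hello over passed fd")
            return adopted.recv(64)
        finally:
            adopted.close()
    finally:
        client.close()
        new_smbd.close()
        existing_smbd.close()


if __name__ == "__main__":
    print(demo())
```

The receiving side ends up with its own descriptor for the same connection, so the process that accepted the connection can exit afterwards.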
275 ==== SMB Direct (RDMA) ====
279 ** requires multi-channel
280 ** start with TCP, bind an RDMA channel
281 ** reads and writes use RDMA write/read
282 ** protocol/metadata via send/receive
285 * wireshark dissector: [DONE (Metze)]
289 ** prereq: multi-channel / fd-passing
290 ** buffer / transport abstractions [TODO]
291 ** central daemon (or kernel module) to serve as RDMA "proxy" \\ %
292 (libraries: not fork safe and no fd-passing)
295 ==== SMB Direct (RDMA) - Plan ====
298 * smbd-d (?) listens for RDMA connection
299 * main smbd listens for TCP connection
300 * main smbd listens (for RDMA) via a unix socket connection to smbd-d
301 * client connects via TCP --> smbd forks child smbd (c1)
302 * client connects via RDMA to smbd-d
303 * smbd-d notifies main smbd and transfers connection info
304 * smbd forks child (c2) that inherits connection to smbd-d
305 * c2 smbd passes [connection to smbd-d] to c1 (via ClientGUID) and exits
306 * c1 establishes mmap area with smbd-d
307 * client does rdma calls to smbd-d
308 ** metadata and protocol calls are transferred via socket to tcp-smbd
309 ** rdma read/write directly to tcp-smbd via mmap area
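The split in the last two points (small control messages over a socket, bulk data through a shared mmap area) can be sketched as follows. This is only an illustration of the idea under simplifying assumptions: both sides live in one process, the shared area is a temporary file mapped twice, and the descriptor layout (offset, length) and all names are hypothetical rather than part of the planned design.

```python
import mmap
import socket
import struct
import tempfile

AREA_SIZE = 4096
# Hypothetical control message: offset and length of the data in the shared area.
DESCRIPTOR = struct.Struct("!II")


def demo() -> bytes:
    proxy_sock, smbd_sock = socket.socketpair()
    with tempfile.TemporaryFile() as backing:
        backing.truncate(AREA_SIZE)
        # Two shared mappings of the same file: one per side.
        proxy_area = mmap.mmap(backing.fileno(), AREA_SIZE)
        smbd_area = mmap.mmap(backing.fileno(), AREA_SIZE)
        try:
            # Proxy side: bulk data lands in the shared area ...
            payload = b"bulk data from an RDMA transfer"
            offset = 128
            proxy_area[offset:offset + len(payload)] = payload
            # ... and only a small descriptor travels over the socket.
            proxy_sock.sendall(DESCRIPTOR.pack(offset, len(payload)))

            # smbd side: read the descriptor, then the data from its own mapping.
            off, length = DESCRIPTOR.unpack(smbd_sock.recv(DESCRIPTOR.size))
            return bytes(smbd_area[off:off + length])
        finally:
            proxy_area.close()
            smbd_area.close()
            proxy_sock.close()
            smbd_sock.close()


if __name__ == "__main__":
    print(demo())
```

The point of the layout is that the payload is never copied through the socket; only its location is.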
312 %%% ==== Persistent Handles ====
317 %%% * like durable file handles with strong guarantees
318 %%% * framework is already there in samba (by support for durable v2)
319 %%% ** ==> easy to satisfy at the protocol level
322 %%% * the difficulty lies in implementing the guarantees
323 %%% ** need to make metadata persistent
324 %%% ** but don't kill performance!
325 %%% ** persistent tdbs !would! kill performance
327 %%% *** need to be sync
328 %%% *** record-level transactions (instead of db-level)
329 %%% *** only replicate to some nodes, not all
333 ==== Clustering Concepts (Windows) ====
339 ** (``traditional'') failover cluster (active-passive)
340 ** protocol: @SMB2\_SHARE\_CAP\_CLUSTER@
342 *** runs off a cluster (failover) volume
343 *** offers the Witness service
347 ** scale-out cluster (all-active!)
348 ** protocol: @SMB2\_SHARE\_CAP\_SCALEOUT@
350 ** Windows: runs off a cluster shared volume (implies cluster)
353 * Continuous Availability (CA):
354 ** transparent failover, persistent handles
355 ** protocol: @SMB2\_SHARE\_CAP\_CONTINUOUS\_AVAILABILITY@
356 ** can be turned on independently on any cluster share (failover or scale-out)
357 ** ==> changed client retry behaviour!
360 %%% ==== Clustering -- Controlling Flags from Windows ====
365 %%% * a share on a cluster carries
366 %%% ** @SMB2\_SHARE\_CAP\_CLUSTER@ $\Leftrightarrow$ the shared FS is a cluster volume.
369 %%% * a share on a cluster carries
370 %%% ** @SMB2\_SHARE\_CAP\_SCALEOUT@ $\Leftrightarrow$ the shared FS is a CSV
371 %%% *** implies @SMB2\_SHARE\_CAP\_CLUSTER@
374 %%% * independently settable on a clustered share:
375 %%% ** @SMB2\_SHARE\_CAP\_CONTINUOUS\_AVAILABILITY@
376 %%% *** implies @SMB2\_SHARE\_CAP\_CLUSTER@
379 ==== Clustering -- Server Behaviour ====
384 * @SMB2\_SHARE\_CAP\_CLUSTER@:
385 ** run witness service (RPC)
386 ** client can register and get notified about resource changes
389 * @SMB2\_SHARE\_CAP\_SCALEOUT@:
390 ** do not grant batch oplocks, write leases, handle leases
391 ** ==> no durable handles unless also CA
394 * @SMB2\_SHARE\_CAP\_CONTINUOUS\_AVAILABILITY@:
395 ** offer persistent handles
396 ** timeout from durable v2 request
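The server rules above can be summarised as a small decision function over the share capability flags. The flag values are the ones given in MS-SMB2 for the TREE_CONNECT response and should be verified against the spec; @ShareCap@, @server_behaviour@ and the result keys are hypothetical names used only for this sketch.

```python
from enum import IntFlag
from typing import Dict


class ShareCap(IntFlag):
    """Subset of the SMB2 share capability flags (values per MS-SMB2)."""
    CONTINUOUS_AVAILABILITY = 0x10
    SCALEOUT = 0x20
    CLUSTER = 0x40


def server_behaviour(caps: ShareCap) -> Dict[str, bool]:
    """Map the flags set on a share to the server behaviour listed above."""
    cluster = bool(caps & ShareCap.CLUSTER)
    scaleout = bool(caps & ShareCap.SCALEOUT)
    ca = bool(caps & ShareCap.CONTINUOUS_AVAILABILITY)
    return {
        # CLUSTER: run the witness service so clients can register.
        "run_witness_service": cluster,
        # SCALEOUT: no batch oplocks, write leases or handle leases.
        "grant_batch_oplocks_and_wh_leases": not scaleout,
        # Consequence: no durable handles on scale-out unless also CA.
        "offer_durable_handles": (not scaleout) or ca,
        # CA: offer persistent handles.
        "offer_persistent_handles": ca,
    }


if __name__ == "__main__":
    share = ShareCap.CLUSTER | ShareCap.SCALEOUT
    for name, enabled in server_behaviour(share).items():
        print(f"{name}: {enabled}")
```

For a share that is scale-out but not CA, the sketch reports no durable handles; adding the CA flag turns both durable and persistent handles on.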
400 ==== Clustering -- Client Behaviour (Win8) ====
406 * @SMB2\_SHARE\_CAP\_CLUSTER@:
407 ** clients happily work if witness is not available
410 * @SMB2\_SHARE\_CAP\_SCALEOUT@:
411 ** clients happily connect if @CLUSTER@ is not set.
412 ** clients DO request oplocks/leases/durable handles
413 ** clients are not confused if they get these
416 * @SMB2\_SHARE\_CAP\_CONTINUOUS\_AVAILABILITY@:
417 ** clients happily connect if @CLUSTER@ is not set.
418 ** clients typically request persistent handle with RWH lease
423 %%%Win8 sends @SMB2\_FLAGS\_REPLAY\_OPERATION@ in writes and reads (from 2nd in a row) \\ %
424 %%%$\Leftrightarrow$ \\ %
425 %%%The server announces @SMB2\_CAP\_PERSISTENT\_HANDLES@.
428 %%% ==== Clustering -- Client Behaviour (Win8) : Retries ====
431 %%% * Test: Win8 against slightly pimped Samba (2 IPs)
434 %%% * Server-Matrix (on/off):
435 %%% ** persistent handle cap
436 %%% ** durable handles
437 %%% ** cluster share cap
443 %%% ** connect to share with explorer
444 %%% ** start copying file (2G)
446 %%% ** wait for the client to pop up an error dialog
451 %%% ==== Clustering -- Client Behaviour (Win8) : Retries ====
454 %%% * only two different retry characteristics: CA $\leftrightarrow$ non-CA
458 %%% ** 3 consecutive attempt rounds:
459 %%% *** for each of the two IPs: \\ %
461 %%% three tcp syn attempts to IP with 0.5 sec breaks
462 %%% ** ==> some 2.1 seconds for 1 round
463 %%% ** between attempts:
464 %%% ** dns, ping, arp ... 5.8 seconds
465 %%% ** ==> _red_18 seconds_
469 %%% ** retries attempt rounds from above for _red_14 minutes_
479 %%% <<<samba-chilli-flavour-crop-bright-1280.jpg,height=.8\textheight>>>
484 ==== Clustering with Samba/CTDB ====
487 * all-active SMB-cluster with Samba and CTDB... \\ %
488 +<3->{...since 2007! \smiley }
491 * transparent for the client
493 *** metadata and messaging engine for Samba in a cluster
494 *** plus cluster resource manager (IPs, services...)
495 ** client only sees one ``big'' SMB server
496 ** we could not change the client!...
497 ** works ``well enough''
501 ** how to integrate SMB3 clustering with Samba/CTDB
502 ** good: rather orthogonal
503 ** ctdb-clustering transparent mostly due to management
506 ==== Witness Service ====
510 ** monitoring of availability of resources (shares, NICs)
511 ** server asks client to move to another resource
515 ** available on a Windows SMB3 share $\Leftrightarrow$ @SMB2\_SHARE\_CAP\_CLUSTER@
516 ** but clients happily connect w/o witness
519 * status in Samba [WIP (Metze, Gregor Beck)]:
520 ** async RPC: WIP, good progress ($\Rightarrow$ Metze's talk)
521 ** wireshark dissector: essentially done
522 ** client: in @rpcclient@ - done
523 ** server: dummy PoC / tracer bullet implementation done
524 ** CTDB: changes / integration needed
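The register-and-notify idea behind the witness service can be shown with a toy in-memory model. It does not reproduce the witness RPC interface or the Samba PoC; the class @ToyWitness@, its methods and the event strings are hypothetical and only mirror the two roles named above: monitoring resources and asking a client to move.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# Callback signature assumed for this sketch: (resource, event text).
Notify = Callable[[str, str], None]


class ToyWitness:
    """Toy model: clients register for a resource and receive notifications."""

    def __init__(self) -> None:
        self._watchers: DefaultDict[str, List[Notify]] = defaultdict(list)

    def register(self, resource: str, callback: Notify) -> None:
        self._watchers[resource].append(callback)

    def _notify(self, resource: str, event: str) -> int:
        callbacks = self._watchers.get(resource, [])
        for callback in callbacks:
            callback(resource, event)
        return len(callbacks)  # number of clients that were notified

    def resource_changed(self, resource: str, available: bool) -> int:
        """Monitoring role: report that a share or NIC went up or down."""
        return self._notify(resource, "available" if available else "unavailable")

    def ask_to_move(self, resource: str, target: str) -> int:
        """Server-driven role: ask registered clients to move elsewhere."""
        return self._notify(resource, f"move to {target}")


if __name__ == "__main__":
    witness = ToyWitness()
    witness.register("node1", lambda res, ev: print(f"client A: {res} {ev}"))
    witness.resource_changed("node1", available=False)
    witness.ask_to_move("node1", "node2")
```

In a cluster the interesting part is where the state changes come from, which is why CTDB integration is listed as still needed.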
532 %%% !@https://wiki.samba.org/index.php/SMB3@!
542 %%% [[[.6\textwidth]]]
544 %%% [[[.3\textwidth]]]
545 %%% <<<samba-chilli-flavour-crop-bright-1280.jpg,height=.8\textheight>>>
571 <<<samba-chilli-flavour-crop-bright-1280.jpg,height=.8\textheight>>>
579 %%% %%%% @obnox\@samba.org / ma\@sernet.de@
581 %%% %%%% \vspace*{1em}
583 %%% %%%% %%%<<<ernie-und-bert-1.jpg,width=.65\textwidth>>>
584 %%% %%%% <<<samba-kisses-better-selection.jpg,width=.6\textwidth>>>