DæmonNews: News and views for the BSD community

Daemon's Advocate

By Robert Watson <rwatson@freebsd.org>

This is a small collection of summaries of patches to the SMP networking stack. It is part of a larger technical news feed detailing the work, for those who are interested in tracking its progress.

There is also an RSS feed available.

20040828:
	Integrate to CVS HEAD.

	Change the default setting for debug_mpsafenet to 1 from 0, which
	  runs the network stack without Giant unless explicitly specified
	  otherwise.  Add "options NET_WITH_GIANT", a kernel option that sets
	  the default back to 0.

	Rewrite locking found in /dev/random entropy collection to coalesce a
	  number of locks into a single harvest_mtx, which will reduce the
	  number of mutex operations when gathering entropy from 4 to 2, and
	  reduce to O(2) the number of mutex operations in the entropy thread
	  when processing gathered entropy from O(4N).  While this
	  potentially increases contention, it will dramatically decrease
	  cost in the uncontended case.
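
To illustrate the idea of coalescing several locks into one, here is a minimal userspace sketch using POSIX mutexes. It is not the /dev/random code: the list layout, the names harvest_gather() and harvest_process(), and the mutex_ops counter are assumptions made for illustration. In this sketch, gathering costs one lock/unlock pair, and processing a batch costs a fixed number of mutex operations regardless of how many events were queued.

```c
#include <pthread.h>
#include <stddef.h>

#define HARVEST_SLOTS 8

struct harvest_event {
    unsigned long data;                 /* gathered entropy sample */
    struct harvest_event *next;
};

/* One mutex covers both lists, instead of one lock per list. */
static pthread_mutex_t harvest_mtx = PTHREAD_MUTEX_INITIALIZER;
static struct harvest_event slots[HARVEST_SLOTS];
static struct harvest_event *free_list;     /* unused event buffers */
static struct harvest_event *filled_list;   /* events awaiting processing */
static unsigned long mutex_ops;             /* counts lock and unlock calls */

static void harvest_lock(void)
{
    pthread_mutex_lock(&harvest_mtx);
    mutex_ops++;
}

static void harvest_unlock(void)
{
    mutex_ops++;
    pthread_mutex_unlock(&harvest_mtx);
}

void harvest_init(void)
{
    free_list = filled_list = NULL;
    mutex_ops = 0;
    for (int i = 0; i < HARVEST_SLOTS; i++) {
        slots[i].next = free_list;
        free_list = &slots[i];
    }
}

/* Gather one sample with a single lock/unlock pair; returns 0 if dropped. */
int harvest_gather(unsigned long sample)
{
    int queued = 0;

    harvest_lock();
    struct harvest_event *ev = free_list;
    if (ev != NULL) {
        free_list = ev->next;
        ev->data = sample;
        ev->next = filled_list;
        filled_list = ev;
        queued = 1;
    }
    harvest_unlock();
    return queued;
}

/* Detach the whole batch under the lock, mix it unlocked, return the buffers. */
unsigned long harvest_process(void)
{
    unsigned long mixed = 0;
    struct harvest_event *batch, *ev, *last = NULL;

    harvest_lock();
    batch = filled_list;
    filled_list = NULL;
    harvest_unlock();

    for (ev = batch; ev != NULL; ev = ev->next) {
        mixed ^= ev->data;              /* stand-in for real entropy mixing */
        last = ev;
    }

    if (last != NULL) {
        harvest_lock();
        last->next = free_list;         /* splice the batch back in one step */
        free_list = batch;
        harvest_unlock();
    }
    return mixed;
}
```

The trade-off is the one noted above: every gatherer and the processing thread now contend on the same mutex, in exchange for fewer operations when nobody else holds it.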

	Add NET_NEEDS_GIANT("component") declaration that allows kernel
	  network components to declare a dependence on Giant for correct
	  operation.  This declaration is checked for the kernel and modules
	  during boot to determine whether debug.mpsafenet should be forced
	  to 0 to ensure correct operation, and if so, a warning is
	  displayed.  If it's too late in the boot, we display a different
	  warning and continue.
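
The declare-then-check flow can be sketched in userspace C as follows. This is only an illustration of the behaviour described above, not the kernel macro: net_needs_giant(), net_process_declarations(), the too_late parameter, and the warning texts are all hypothetical names and strings.

```c
#include <stdio.h>

#define MAX_GIANT_COMPONENTS 16

static int debug_mpsafenet = 1;     /* 1: run the network stack without Giant */
static const char *needs_giant[MAX_GIANT_COMPONENTS];
static int needs_giant_count;

/* Hypothetical stand-in for a NET_NEEDS_GIANT("component") declaration. */
void net_needs_giant(const char *component)
{
    if (needs_giant_count < MAX_GIANT_COMPONENTS)
        needs_giant[needs_giant_count++] = component;
}

/*
 * Walk the declarations.  Early in boot the setting is forced to 0 and a
 * warning is printed; if it is too late to switch, only a (different)
 * warning is printed.  Returns the number of warnings issued.
 */
int net_process_declarations(int too_late)
{
    int warnings = 0;

    for (int i = 0; i < needs_giant_count; i++) {
        if (!debug_mpsafenet)
            continue;               /* Giant already covers the stack */
        if (too_late) {
            fprintf(stderr, "WARNING: %s needs Giant, but the stack is "
                "already running without it\n", needs_giant[i]);
        } else {
            debug_mpsafenet = 0;
            fprintf(stderr, "WARNING: %s needs Giant, forcing "
                "debug.mpsafenet to 0\n", needs_giant[i]);
        }
        warnings++;
    }
    return warnings;
}
```

With two components declared early in boot, only the first triggers a warning, because the setting is already 0 by the time the second is examined.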

	Instrument ng_tty, IPX, KAME IPSEC to declare dependence on Giant.

	Remove unused random tick handling in KAME IPSEC to reduce need for
	  synchronization.

	Merged removal of UNIX domain socket locks from unp_gc() to FreeBSD
	  CVS repository.

20040825:
	Integrate to CVS HEAD.

	Merged removal of conditional socket buffer locking in socket kqueue
	  filters to FreeBSD CVS.

	Removed references to thread0 (for its credential) in ng_ksocket.c,
	  as they are in practice unused in HEAD due to curthread always
	  being defined as non-NULL.  However, the use of a thread here is
	  improper, and probably suggests replacing thread references with
	  credential references at a number of points in the protocol API.

	Merged replacement of &thread0 with curthread in nfs_timer.c to
	  prevent use of non-curthread for suser() by UDP6.

	Merged in6_prefix.[ch] and router renumbering removal from George
	  Neville-Neil to FreeBSD CVS, which simplifies IPv6 locking.

	Merged marking of if_dc as IFF_NEEDSGIANT to FreeBSD CVS.  While
	  if_dc contains locking, it's disabled by default and has not been
	  reviewed or adequately tested.

	Merged fix to an NFS server bug where Giant was improperly acquired
	  instead of released in nfsrv_link(), resulting in an assertion
	  failure with INVARIANTS (or a delayed failure for non-INVARIANTS in
	  the event nfsd exited).

20040822:
	Integrate to CVS HEAD.

	Merged removal of Giant assertion in setugidsafety() to FreeBSD CVS.

	Merged removal of Giant assertion in kqueue_close() to FreeBSD CVS.

	Removed conditional locking in socket kqueue filters, as the socket
	  buffer mutex will now always be held at that point.

	Merged UDP link header mbuf allocation optimization to FreeBSD CVS.

	Merged in6_pcbnotify() bug fix to conditionally unlock based on
	  return value of the notify routine to FreeBSD CVS.

20040819:
	Integrate to CVS HEAD.

	Merged GIANT_REQUIRED for fwe_start() to FreeBSD CVS.

	Updated annotations of GIANT_REQUIRED in close()-related functions,
	  including setugidsafety() and fdcloseexec(), as KQueue now depends
	  on Giant less.

	Merged UNP_UNLOCK_ASSERT() to FreeBSD CVS.

	Merged INP_LOCK_ASSERT() assertion in in_pcbrehash() to FreeBSD
	  CVS.

	Merged push-down of udp_send() locks into udp_output(), as well as
	  avoidance of pcbinfo locking in the bound/connected/non-sendto send
	  case to FreeBSD CVS.

	Possibly correct a bug involving TCP and in6_pcbnotify() where the
	  return value of the notify routine was not being used to determine
	  if the inpcb should be unlocked or not by the caller.  Problem
	  reported by Jun Kuriyama.
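
The pattern at issue, where the notify routine's return value tells the caller whether the pcb still exists and is still locked, can be sketched like this. It is a userspace illustration under assumed names (struct pcb, notify_record(), notify_drop(), pcb_notify()), not the in6_pcbnotify() code; the "dropped" pcb is merely flagged rather than freed so that the example can be inspected afterwards.

```c
#include <pthread.h>
#include <stddef.h>

struct pcb {
    pthread_mutex_t lock;
    int locked;     /* bookkeeping for this sketch only */
    int error;
    int dead;       /* set where a real stack would free the pcb */
};

/* A notify routine returns the pcb (still locked) or NULL if it dropped it. */
typedef struct pcb *(*notify_fn)(struct pcb *, int);

/* Records the error and hands the pcb back, still locked. */
struct pcb *notify_record(struct pcb *p, int err)
{
    p->error = err;
    return p;
}

/* Tears the pcb down: releases the lock itself and returns NULL. */
struct pcb *notify_drop(struct pcb *p, int err)
{
    p->error = err;
    p->dead = 1;
    p->locked = 0;
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

/* Caller side: unlock only if the notify routine handed the pcb back. */
void pcb_notify(struct pcb *p, int err, notify_fn notify)
{
    pthread_mutex_lock(&p->lock);
    p->locked = 1;

    p = notify(p, err);
    if (p != NULL) {
        p->locked = 0;
        pthread_mutex_unlock(&p->lock);
    }
    /* If p is NULL the pcb is gone; touching its lock would be a bug. */
}
```

Ignoring the return value and unlocking unconditionally would either unlock a mutex that is no longer held or touch memory that has been released.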

	Added initial task list for further netinet6 locking.

20040817:
	Integrate to CVS HEAD: ipfw, dummynet, etc using PFIL_HOOKS.
	  6.x-CURRENT.

20040816:
	Integrate to CVS HEAD.

	Cleanup to Giant-reduced close() following KQueue locking integration.

	Merged UNIX domain socket lock over sotounpcb() changes to FreeBSD
	  CVS.

20040815:
	Integrate to CVS HEAD.

	Pick up kqueue locking from John-Mark Gurney (woo hoo!).  This
	  removes the requirement for Giant in the high level close() system
	  call code.  Still required in closef() for VFS, however.

	Introduce additional UNP_UNLOCK_ASSERT() calls to check that the UNIX
	  domain socket subsystem lock is released after unp_detach(), as a
	  substitute for NB comments stating that it will be.

	Annotate some additional potential races in the UNIX domain
	  socket code.

	Merge UNIX domain socket locking description to FreeBSD CVS.

	Introduce MP_WATCHDOG, which dedicates a CPU in an SMP system to act
	  as a system watchdog, substituting for a lack of an NMI button on
	  systems that don't have one.

	Merge MP_WATCHDOG to FreeBSD CVS.

20040814:
	Integrate to CVS HEAD.

	Reformulation of UNIX domain socket locking to make sure that the UNP
	  subsystem lock covers checks of so_pcb pointers, which will prevent
	  a variety of races.  Also, introduce additional tests to check for
	  races between close() and connect() which are present even in
	  non-mpsafe network stacks, and triggered by recent sched_ule
	  changes.  Annotate additional race possibilities.  Query: should we
	  be modifying reference handling and locking at the file descriptor
	  layer to prevent a close() from finishing until all system calls
	  outstanding on the file descriptor complete?  Don't hold the UNP
	  subsystem lock over the unp_gc() garbage collection pieces.

	Merged IFF_NEEDSGIANT flagging for many common non-MPSAFE network
	  interface drivers to FreeBSD CVS.

	David Malone fixed a locking bug involving syncache access on IPv6.

	Merged move to non-blocking mbuf allocation for IPv6 raw socket sends
	  to FreeBSD CVS.

	In if_dc, rely on debug_mpsafenet to determine if we should run
	  MPSAFE rather than the IS_MPSAFE flag in the driver.  Note: this
	  requires much testing.

	In if_pcn, mark the interrupt handler as MPSAFE since the driver
	  appears to be locked.  Note: this requires much testing.

20040811:
	Integrate to CVS HEAD.

	Merged IFF_NEEDSGIANT for if_fwe to FreeBSD CVS.

	Merged lockless read of entropy harvest fifo count to FreeBSD CVS.

	Merged IFF_NEEDSGIANT for USB network interfaces to FreeBSD CVS.

	Merged splnet()->locking reference in comment in sosend() to FreeBSD
	  CVS.

	Merged inpcbinfo and inpcb locking assertions for in_pcbconnect() and
	  in_pcbconnect_setup() to FreeBSD CVS.

	Merged udp_send() fix to free control mbufs on the so_pcb pointer
	  being NULL to FreeBSD CVS.

	Reconstituted udp_output() to try to avoid locking the udbinfo
	  structure when no rebinding is performed.  Even when the lock is
	  acquired, try to reduce the time it is held for.  Locking is pushed
	  down from udp_send() so that more information is available to make
	  the locking decision.

	In udp_output(), include experimental code to provide additional mbuf
	  storage on the front end of the user data to provide room for a
	  link layer header to try to avoid additional mbuf allocation for
	  the ethernet header on send.
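
The effect of leaving headroom at the front of the data can be shown with a flat buffer standing in for an mbuf. This is a sketch under assumptions: struct pktbuf, pkt_init(), pkt_prepend(), the LINK_HDR_MAX value, and the extra_allocs counter are invented for illustration, and the fallback path only simulates the extra allocation by shifting the data.

```c
#include <stddef.h>
#include <string.h>

#define BUF_SIZE     256
#define LINK_HDR_MAX 16     /* assumed headroom; an Ethernet header is 14 bytes */

struct pktbuf {
    unsigned char storage[BUF_SIZE];
    size_t off;             /* offset of the first valid byte */
    size_t len;             /* number of valid bytes */
    int extra_allocs;       /* how often prepending found no room */
};

/* Copy in the payload, leaving `headroom` free bytes in front of it. */
int pkt_init(struct pktbuf *p, const void *payload, size_t len, size_t headroom)
{
    if (headroom > BUF_SIZE || len > BUF_SIZE - headroom)
        return -1;
    p->off = headroom;
    p->len = len;
    p->extra_allocs = 0;
    memcpy(p->storage + p->off, payload, len);
    return 0;
}

/* Prepend a header, using the headroom when there is enough of it. */
int pkt_prepend(struct pktbuf *p, const void *hdr, size_t hlen)
{
    if (p->off < hlen) {
        /* No room: an mbuf chain would need another mbuf at this point. */
        if (hlen > BUF_SIZE || p->len > BUF_SIZE - hlen)
            return -1;
        memmove(p->storage + hlen, p->storage + p->off, p->len);
        p->off = hlen;
        p->extra_allocs++;
    }
    p->off -= hlen;
    p->len += hlen;
    memcpy(p->storage + p->off, hdr, hlen);
    return 0;
}
```

When the transport layer reserves LINK_HDR_MAX bytes up front, a later 14-byte prepend just moves the offset back; with no headroom, the same prepend takes the slow path.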

	Merge raw_ip6 send M_PREPEND() change to M_DONTWAIT to avoid sleeping
	  while holding the raw pcb mutex.  Include necessary error handling.

20040810:
	Integrate to CVS HEAD.

	Merged ADAPTIVE_GIANT in GENERIC to FreeBSD CVS; results in 30%
	  improvement in MySQL benchmarks with SMP w/o debug.mpsafenet, and
	  6%+ improvement with debug.mpsafenet.  Scott Long reports 16%
	  performance improvement on buildworld on SMP.

	Merged KTR system call tracing for i386 to FreeBSD CVS.

	Merged Giant pushdown in fcntl() to FreeBSD CVS.  MA_NOTOWNED
	  assertions disabled by default due to use of Giant by Linux ABI
	  wrapper.

	Removed use of atomic operations in the mbuf allocator for statistics.

	Merged narrowing of scope of uidinfo locking in chgsbsize() to
	  FreeBSD CVS.

	Merged extension of KTR_PROC tracing in mi_switch() to FreeBSD CVS.

	Merged addition of KTR_CALLOUT and callout tracing to FreeBSD CVS.

	Started adding comments and annotation of AIO structures in
	  preparation for starting locking for AIO.

	Merged GIANT_REQUIRED assertions for VFS operations, push-down of
	  Giant in some VFS operations to FreeBSD CVS.

	Removed use of task queue in SLIP in if_start.

	Merged removal of GIANT_REQUIRED in netatalk to FreeBSD CVS.

	Began to push down inpcb references into various socket option
	  processing routines, including ip_pcbopts(), ip_setmoptions(),
	  ip_getmoptions().  This will permit these routines to acquire inpcb
	  locks when needed, whereas current acquisition of the locks in the
	  calling code will result in holding a mutex over potentially
	  sleeping copyin and memory allocation routines.  Annotate need for
	  locking in these routines.
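
The ordering problem described above can be sketched in userspace terms: do the work that may sleep (allocation, copying from the caller) first, and take the per-pcb mutex only to install the result. The names struct pcb and pcb_set_options() are hypothetical, and memcpy() merely stands in for a copyin()-style operation.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct pcb {
    pthread_mutex_t lock;
    void *options;          /* currently installed option data */
    size_t optlen;
};

/* The routine receives the pcb itself, so it decides when to lock it. */
int pcb_set_options(struct pcb *p, const void *user_buf, size_t len)
{
    void *copy, *old;

    if (len == 0)
        return EINVAL;

    /* Potentially sleeping work happens first, with no mutex held. */
    copy = malloc(len);
    if (copy == NULL)
        return ENOMEM;
    memcpy(copy, user_buf, len);    /* stand-in for copying from user space */

    /* The lock is held only for the pointer swap. */
    pthread_mutex_lock(&p->lock);
    old = p->options;
    p->options = copy;
    p->optlen = len;
    pthread_mutex_unlock(&p->lock);

    free(old);                      /* released after the lock is dropped */
    return 0;
}
```

If the caller locked the pcb before invoking the routine instead, the allocation and the copy would both run with the mutex held.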

	Merge KTR_UMA and basic UMA allocation and free tracing to FreeBSD
	  CVS.

20040806:
	Integrate to CVS HEAD.

	Convert TIMEOUT_SAMPLING callout/timeout tracing to using KTR, which
	  provides a much better vehicle for analysis with context.  Now uses
	  KTR_CALLOUT.

	Modify KTR tracing for thread_exit() to include more thread context
	  for analysis with mi_switch() and fork_exit().

	Additional inpcb/inpcbinfo locking assertions for the inpcb connect
	  code.

	Less inpcb locking for retrieving local and peer addresses from an
	  inpcb.  This could use refinement.

	Merged in6_pcbnotify() cleanup and fixes to FreeBSD CVS.

	Remove INP_LOCK_ASSERT() from tcp_timer_2msl_stop(), as it's called
	  after the inpcb is disconnected from the time wait state (and
	  resulted in a NULL pointer dereference).

	Merged UDP broadcast/multicast receive locking optimization to
	  FreeBSD CVS.

20040805:
	Integrate to CVS HEAD.

	Perform a lockless read when harvesting entropy to check that the
	  entropy fifo is not full, avoiding the mutex cost if it is.  Reduce
	  the size of the harvesting fifo experimentally.
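
A lockless "is it full?" check in front of a mutex-protected fifo might look like the following sketch. It uses C11 atomics and a POSIX mutex in userspace; fifo_put(), FIFO_MAX, and the lock_acquisitions counter are assumptions for illustration. The unlocked read is only a hint, so the count is checked again once the mutex is held.

```c
#include <pthread.h>
#include <stdatomic.h>

#define FIFO_MAX 4

static pthread_mutex_t fifo_mtx = PTHREAD_MUTEX_INITIALIZER;
static atomic_int fifo_count;               /* written only under fifo_mtx */
static unsigned long fifo[FIFO_MAX];
static unsigned long lock_acquisitions;     /* for illustration only */

/* Returns 1 if the sample was stored, 0 if it was dropped. */
int fifo_put(unsigned long sample)
{
    int stored = 0;

    /* Unlocked peek: possibly stale, but enough to skip the mutex when full. */
    if (atomic_load_explicit(&fifo_count, memory_order_relaxed) >= FIFO_MAX)
        return 0;

    pthread_mutex_lock(&fifo_mtx);
    lock_acquisitions++;
    int n = atomic_load_explicit(&fifo_count, memory_order_relaxed);
    if (n < FIFO_MAX) {                     /* authoritative check, under lock */
        fifo[n] = sample;
        atomic_store_explicit(&fifo_count, n + 1, memory_order_relaxed);
        stored = 1;
    }
    pthread_mutex_unlock(&fifo_mtx);
    return stored;
}
```

A stale read can at worst drop a sample that would have fit or take the lock when the fifo has just filled; neither corrupts the fifo, which is why the cheap check is acceptable for an opportunistic producer.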

	Add KTR tracing for system calls on i386.  Similar changes are needed
	  for non-i386.

	Merged GIANT_REQUIRED in fdfree(), setugidsafety(), fdcheckstd(),
	  _fgetvp(), conditional assertion in fgetsock() to FreeBSD CVS.

	Merged spl() removal from chgsbsize() to FreeBSD CVS.

	Merged lockless reads of bif_dlist in BPF packet tap to FreeBSD CVS
	  to avoid BPF locking cost if there are no listeners.

	Don't harvest entropy in ether_input(); the current harvesting is a
	  bug (and costly, due to mutex operations).

	Merged inpcb locking assertions in the presence of IPv6 to FreeBSD
	  CVS.

	Pass inpcbinfo to in6_pcbnotify() rather than inpcbhead, as it needs
	  to iterate the list of pcb's, requiring it to hold the info lock,
	  as well as acquire inpcb locks before notifying them of events.
	  Update various consumers, including UDP, TCP, and raw IPv6.

	Assert TCP inpcb locks in TCP timers relating to TIME_WAIT, as they
	  are required.  Annotate possible locking problem of the global time
	  wait list.

	Don't acquire inpcb locks for UDP pcb's when searching for
	  potentially matching broadcast/multicast sockets.  Acquire the
	  inpcb mutex only when we've found a potential match.  This avoids
	  120+ mutex operations per broadcast packet received in my local
	  configuration (ouch!).
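
A minimal sketch of the "lock only on a potential match" search follows. It assumes, as a simplification, that the caller holds whatever list-level lock keeps the compared field stable during the walk; struct udp_pcb, deliver_broadcast(), and the pcb_lock_ops counter are hypothetical names, and matching on the local port alone is a stand-in for the real matching rules.

```c
#include <pthread.h>
#include <stddef.h>

struct udp_pcb {
    pthread_mutex_t lock;
    unsigned short lport;       /* local port, assumed stable during the walk */
    int delivered;              /* packets handed to this socket */
    struct udp_pcb *next;
};

static unsigned long pcb_lock_ops;  /* counts per-pcb lock and unlock calls */

/* Deliver a broadcast to every pcb bound to dport; returns the match count. */
int deliver_broadcast(struct udp_pcb *head, unsigned short dport)
{
    int matches = 0;

    for (struct udp_pcb *p = head; p != NULL; p = p->next) {
        if (p->lport != dport)
            continue;           /* rejected without touching the pcb mutex */

        pthread_mutex_lock(&p->lock);
        pcb_lock_ops++;
        p->delivered++;         /* stand-in for queueing the packet */
        matches++;
        pcb_lock_ops++;
        pthread_mutex_unlock(&p->lock);
    }
    return matches;
}
```

For a list of 60 pcbs with a single match, this costs 2 per-pcb mutex operations, where locking every pcb before comparing would cost 120.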

	Merge uidinfo locking key to FreeBSD CVS.

	Add rudimentary UMA KTR tracing.

20040802:
	Integrate to CVS HEAD.

	Giant becomes optional for a number of fcntl() operations.  Still
	  held over fo_ioctl().

	Documentation of uidinfo locking strategy.  Slight optimization by
	  reducing lock coverage.  Spl removal.

	Trimming of possibly unnecessary sched_lock lockage in sched_4bsd
	  userret().

	Return path simplification in ioctl() relating to Giant dropping.

	Merged accept filter registration locking to FreeBSD CVS.

	Merged IFF_NEEDSGIANT to FreeBSD CVS; tweaks to individual device
	  drivers not merged, however.

	Merged IPv6 in6pcb lock to FreeBSD CVS.

For the rest of the news, take a look at: http://www.watson.org/~robert/freebsd/netperf/









    Author maintains all copyrights on this article.
    Images and layout Copyright © 1998-2006 Dæmon News. All Rights Reserved.