Skip to content

Commit

Permalink
dlm: fixes for nodir mode
Browse files Browse the repository at this point in the history
The "nodir" mode (statically assign master nodes instead
of using the resource directory) has always been highly
experimental, and never seriously used.  This commit
fixes a number of problems, making nodir much more usable.

- Major change to recovery: recover all locks and restart
  all in-progress operations after recovery.  In some
  cases it's not possible to know which in-progess locks
  to recover, so recover all.  (Most require recovery
  in nodir mode anyway since rehashing changes most
  master nodes.)

- Change the way nodir mode is enabled, from a command
  line mount arg passed through gfs2, into a sysfs
  file managed by dlm_controld, consistent with the
  other config settings.

- Allow recovering MSTCPY locks on an rsb that has not
  yet been turned into a master copy.

- Ignore RCOM_LOCK and RCOM_LOCK_REPLY recovery messages
  from a previous, aborted recovery cycle.  Base this
  on the local recovery status not being in the state
  where any nodes should be sending LOCK messages for the
  current recovery cycle.

- Hold rsb lock around dlm_purge_mstcpy_locks() because it
  may run concurrently with dlm_recover_master_copy().

- Maintain highbast on process-copy lkb's (in addition to
  the master as is usual), because the lkb can switch
  back and forth between being a master and being a
  process copy as the master node changes in recovery.

- When recovering MSTCPY locks, flag rsb's that have
  non-empty convert or waiting queues for granting
  at the end of recovery.  (Rename flag from LOCKS_PURGED
  to RECOVER_GRANT and similar for the recovery function,
  because it's not only resources with purged locks
  that need grant a grant attempt.)

- Replace a couple of unnecessary assertion panics with
  error messages.

Signed-off-by: David Teigland <teigland@redhat.com>
  • Loading branch information
David Teigland committed May 2, 2012
1 parent 6d40c4a commit 4875647
Show file tree
Hide file tree
Showing 13 changed files with 304 additions and 172 deletions.
3 changes: 2 additions & 1 deletion fs/dlm/ast.c
Original file line number Diff line number Diff line change
Expand Up @@ -310,6 +310,7 @@ void dlm_callback_resume(struct dlm_ls *ls)
}
mutex_unlock(&ls->ls_cb_mutex);

log_debug(ls, "dlm_callback_resume %d", count);
if (count)
log_debug(ls, "dlm_callback_resume %d", count);
}

8 changes: 6 additions & 2 deletions fs/dlm/dlm_internal.h
Original file line number Diff line number Diff line change
Expand Up @@ -271,6 +271,8 @@ struct dlm_lkb {
ktime_t lkb_last_cast_time; /* for debugging */
ktime_t lkb_last_bast_time; /* for debugging */

uint64_t lkb_recover_seq; /* from ls_recover_seq */

char *lkb_lvbptr;
struct dlm_lksb *lkb_lksb; /* caller's status block */
void (*lkb_astfn) (void *astparam);
Expand Down Expand Up @@ -325,7 +327,7 @@ enum rsb_flags {
RSB_NEW_MASTER,
RSB_NEW_MASTER2,
RSB_RECOVER_CONVERT,
RSB_LOCKS_PURGED,
RSB_RECOVER_GRANT,
};

static inline void rsb_set_flag(struct dlm_rsb *r, enum rsb_flags flag)
Expand Down Expand Up @@ -571,6 +573,7 @@ struct dlm_ls {
struct mutex ls_requestqueue_mutex;
struct dlm_rcom *ls_recover_buf;
int ls_recover_nodeid; /* for debugging */
unsigned int ls_recover_locks_in; /* for log info */
uint64_t ls_rcom_seq;
spinlock_t ls_rcom_spin;
struct list_head ls_recover_list;
Expand All @@ -597,6 +600,7 @@ struct dlm_ls {
#define LSFL_UEVENT_WAIT 5
#define LSFL_TIMEWARN 6
#define LSFL_CB_DELAY 7
#define LSFL_NODIR 8

/* much of this is just saving user space pointers associated with the
lock that we pass back to the user lib with an ast */
Expand Down Expand Up @@ -644,7 +648,7 @@ static inline int dlm_recovery_stopped(struct dlm_ls *ls)

static inline int dlm_no_directory(struct dlm_ls *ls)
{
return (ls->ls_exflags & DLM_LSFL_NODIR) ? 1 : 0;
return test_bit(LSFL_NODIR, &ls->ls_flags);
}

int dlm_netlink_init(void);
Expand Down
Loading

0 comments on commit 4875647

Please sign in to comment.