Commit graph

39,043 commits

Author SHA1 Message Date
Joe Eykholt
24f089e2f2 [SCSI] libfc: add fc_fill_reply_hdr() and fc_fill_hdr()
Add functions to fill in an FC header given a request header.
These reduces code lines in fc_lport and fc_rport and works
without an exchange/sequence assigned.

fc_fill_reply_hdr() fills a header for a final reply frame.

fc_fill_hdr() which is similar but allows specifying the
f_ctl parameter.

Add defines for F_CTL values FC_FCTL_REQ and FC_FCTL_RESP.
These can be used for most request and response sequences.

v2 of patch adds a line to copy the frame encapsulation
info from the received frame.

Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:06:00 -05:00
Joe Eykholt
251748a99e [SCSI] libfc: add fc_frame_sid() and fc_frame_did() functions
To pave the way for eliminating exchanges from incoming requests,
add simple inline fc_frame_sid() and fc_frame_did() functions
which get the FC_IDs from the frame header.  This can be almost
as efficient as getting them from the sequence/exchange.

Move ntohll, htonll, ntoh24 and hton24 to <scsi/fc_frame.h>
since we need them there and that's included by <scsi/libfc.h>

Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:05:59 -05:00
Joe Eykholt
079ecd8cfe [SCSI] libfc: eliminate rport LOGO state
The LOGO state hasn't been used in a while, except in a brief
transition to DELETE state while holding the rport mutex.
All port LOGO responses have been ignored as well as any timeout
if we don't get a response.

So this patch just removes LOGO state and simplifies the response handler.

Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:05:58 -05:00
Joe Eykholt
e10f8c667b [SCSI] libfcoe: fcoe: fnic: add FIP VN2VN point-to-multipoint support
The FC-BB-6 committee is proposing a new FIP usage model called
VN_port to VN_port mode.  It allows VN_ports to discover each other
over a loss-free L2 Ethernet without any FCF or Fibre-channel fabric
services.  This is point-to-multipoint.  There is also a variant
of this called point-to-point which provides for making sure there
is just one pair of ports operating over the Ethernet fabric.

We add these new states:  VNMP_START, _PROBE1, _PROBE2, _CLAIM, and _UP.
These usually go quickly in that sequence.  After waiting a random
amount of time up to 100 ms in START, we select a pseudo-random
proposed locally-unique port ID and send out probes in states PROBE1
and PROBE2, 100 ms apart.  If no probe responses are heard, we
proceed to CLAIM state 400 ms later and send a claim notification.
We wait another 400 ms to receive claim responses, which give us
a list of the other nodes on the network, including their FC-4
capabilities.  After another 400 ms we go to VNMP_UP state and
should start interoperating with any of the nodes for whic we
receivec claim responses.  More details are in the spec.j

Add the new mode as FIP_MODE_VN2VN.  The driver must specify
explicitly that it wants to operate in this mode.  There is
no automatic detection between point-to-multipoint and fabric
mode, and the local port initialization is affected, so it isn't
anticipated that there will ever be any such automatic switchover.

It may eventually be possible to have both fabric and VN2VN
modes on the same L2 network, which may be done by two separate
local VN_ports (lports).

When in VN2VN mode, FIP replaces libfc's fabric-oriented discovery
module with its own simple code that adds remote ports as they
are discovered from incoming claim notifications and responses.
These hooks are placed by fcoe_disc_init().

A linear list of discovered vn_ports is maintained under the
fcoe_ctlr struct.  It is expected to be short for now, and
accessed infrequently.  It is kept under RCU for lock-ordering
reasons.  The lport and/or rport mutexes may be held when we
need to lookup a fcoe_vnport during an ELS send.

Change fcoe_ctlr_encaps() to lookup the destination vn_port in
the list of peers for the destination MAC address of the
FIP-encapsulated frame.

Add a new function fcoe_disc_init() to initialize just the
discovery portion of libfcoe for VN2VN mode.

Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:05:56 -05:00
Joe Eykholt
edcbb4395e [SCSI] libfcoe: add protocol description of FIP VN2VN mode
The FC-BB-6 committee is proposing a new FIP usage model called
VN_port to VN_port mode.  It allows VN_ports to discover each other
over a loss-free L2 Ethernet without any FCF or Fibre-channel fabric
services.  This is point-to-multipoint.  There is also a variant
of this called point-to-point which provides for making sure there
is just one pair of ports operating over the Ethernet fabric.

This patch defines the new message type and subtypes as well as
one new descriptor type used by VN2VN mode.

These are all still at the proposed stage and subject to change.

Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:05:55 -05:00
Joe Eykholt
f60e12e9c7 [SCSI] libfc: track FIP exchanges
When an exchange is received with a FIP encapsulation, we need
to know that the response must be sent via FIP and what the original
ELS opcode was.  This becomes important for VN2VN mode, where we may
receive FLOGI or LOGO from several peer VN_ports, and the LS_ACC or
LS_RJT must be sent FIP-encapsulated with the correct sub-type.

Add a field to the struct fc_frame, fr_encaps, to indicate the
encapsulation values.  That term is chosen to be neutral and
LLD-agnostic in case non-FCoE/FIP LLDs might find it useful.

The frame fr_encaps is transferred from the ingress frame to the
exchange by fc_exch_recv_req(), and back to the outgoing frame
by fc_seq_send().

This is taking the last byte in the skb->cb array.  If needed,
we could combine the info in sof, eof, flags, and encaps
together into one field, but it'd be better to do that if
and when its needed.

Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:05:54 -05:00
Joe Eykholt
a7b12a279f [SCSI] libfc: add FLOGI state to rport for VN2VN
The FIP proposal for VN_port to VN_port point-to-multipoint
operation requires a FLOGI be sent to each remote port.
The FLOGI is sent with the assigned S_ID and D_IDs of the
local and remote ports.  This and the response get
FIP-encapsulated for Ethernet.

Add FLOGI state to the remote port state machine.
This will be skipped if not in point-to-multipoint mode.

To reduce a little duplication between PLOGI and FLOGI
response handling, added fc_rport_login_complete(), which
handles the parameters for the rdata struct.

Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:05:53 -05:00
Joe Eykholt
3726f3584e [SCSI] libfc: Add local port point-to-multipoint flag
For VN_port to VN_port mode, the transport sets the port_id and
there's no lport FLOGI.  This is similar to FC loop mode.

Add a point_to_multipoint flag that indicates the local port is in
point-to-multipoint mode.  This skips FLOGI and discovery.
It also skips resetting the port_id on resets other than link down.

Add function fc_lport_set_local_id() that sets the local port_id.
This is called by libfcoe on behalf of the low-level driver
to set the port_id when the link comes up.

Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:05:53 -05:00
Joe Eykholt
3d902ac09a [SCSI] libfcoe: fcoe: fnic: change fcoe_ctlr_init interface to specify mode
There are three modes that libfcoe currently supports, and a new one
is coming.  Change the fcoe_ctlr_init() interface to add the mode
desired.  This should not change any functionality.

Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:05:52 -05:00
Joe Eykholt
0685230c59 [SCSI] libfc: add discovery-private pointer for LLD
For VN_port to VN_port mode, FIP will do discovery and needs a
way to find its state from the local port or discovery structure.
It seems that any other LLD that implements its own discovery
would also need something like this.

Replace disc->lport with disc->priv, and use container_of to
find the lport.  We could use disc->priv for that, but
container_of is smaller and faster.

Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:05:52 -05:00
Joe Eykholt
fdb068c6cd [SCSI] libfcoe: convert FIP to lock with mutex instead of spin lock
It turns out most of the FIP work is now done from worker threads
or process context now, so there's no need to use a spin lock.

Change to use mutex instead of spin lock and delayed_work instead
of a timer.

This will make it nicer for the VN_port to VN_port feature that
will interact more with the libfc layers requiring that
spinlocks not be held.

Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:05:51 -05:00
Joe Eykholt
f90377abca [SCSI] libfc: provide space for LLD after remote port structure
Add pre-zeroed space after the allocation for fc_rport_priv
for use by the lower-level driver.

This is primarily for VN2VN FIP mode, but could be used in
other ways someday.

The space required is specified in lport->rport_priv_size.

Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:05:49 -05:00
Joe Eykholt
42e9041467 [SCSI] libfc: convert rport lookup to be RCU safe
To allow LLD to do lookups on rports without grabbing a mutex,
make them RCU-safe.  The caller of lport->tt.rport_lookup will
have the choice of holding disc_mutex or the rcu_read_lock().

Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:05:48 -05:00
Vasu Dev
519e5135e2 [SCSI] fcoe: adds src and dest mac address checking for fcoe frames
This is  per FC-BB-5 Annex-D recommendation and per that
if address checking fails then drop the frame.

FIP code paths are already doing this so only needed for fcoe
frames.

The src address checking is limited to only fip mode since
this might break non-fip mode used in p2p due to used OUI
based addressing in some p2p code paths, going forward FIP
will be the only mode, therefore limited this to only FIP
mode so that it won't break non-fip p2p mode for now.

-v2
Removes FCOE packet type checking since fcoe_rcv is
registered to receive only FCoE type packets from netdev
and it is already checked by netdev.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:05:47 -05:00
Bart Van Assche
d058fd31c7 [SCSI] fcoe: make it possible to verify fcoe with sparse
Analyzing fcoe with sparse currently fails. This is because struct
fcoe_rcv_info contains two enum members that have been declared with
__attribute__((packed)). Apparently gcc honors this attribute while sparse
ignores it. The result is that sizeof(struct fcoe_rcv_info)
== sizeof(struct sk_buff::cb) == 48 on a 64-bit system according to gcc, but
not according to sparse. The patch below modifies the definition of
struct fcoe_rcv_info such that gcc and sparse interpret this structure
definition in the same way. The current sparse output is as follows:

$ cd linux-2.6.34
$ make C=2 M=drivers/scsi/fcoe modules
 CHECK   drivers/scsi/fcoe/fcoe.c

include/scsi/fc_frame.h:81:9: error: invalid bitfield width, -1.
 CC [M]  drivers/scsi/fcoe/fcoe.o
 CHECK   drivers/scsi/fcoe/libfcoe.c

include/scsi/fc_frame.h:81:9: error: invalid bitfield width, -1.
drivers/scsi/fcoe/libfcoe.c:56:37: error: invalid initializer

Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Cc: jeykholt@cisco.com
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:05:44 -05:00
Vikas Chaudhary
3b2bef1fc8 [SCSI] iscsi_transport: added new iscsi_param to display target alias in sysfs
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-07-28 09:05:25 -05:00
Eric Paris
08ae89380a fanotify: drop the useless priority argument
The priority argument in fanotify is useless.  Kill it.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:03 -04:00
Eric Paris
fb1cfb88c8 fsnotify: initialize mask in fsnotify_perm
akpm got a warning the fsnotify_mask could be used uninitialized in
fsnotify_perm().  It's not actually possible but his compiler complained
about it.  This patch just initializes it to 0 to shut up the compiler.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:02 -04:00
Eric Paris
b2d879096a fanotify: userspace interface for permission responses
fanotify groups need to respond to events which include permissions types.
To do so groups will send a response using write() on the fanotify_fd they
have open.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:02 -04:00
Eric Paris
9e66e4233d fanotify: permissions and blocking
This is the backend work needed for fanotify to support the new
FS_OPEN_PERM and FS_ACCESS_PERM fsnotify events.  This is done using the
new fsnotify secondary queue.  No userspace interface is provided actually
respond to or request these events.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:02 -04:00
Eric Paris
c4ec54b40d fsnotify: new fsnotify hooks and events types for access decisions
introduce a new fsnotify hook, fsnotify_perm(), which is called from the
security code.  This hook is used to allow fsnotify groups to make access
control decisions about events on the system.  We also must change the
generic fsnotify function to return an error code if we intend these hooks
to be in any way useful.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:01 -04:00
Dave Young
d14f172948 sysctl extern cleanup: inotify
Extern declarations in sysctl.c should be move to their own head file, and
then include them in relavant .c files.

Move inotify_table extern declaration to linux/inotify.h

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:01 -04:00
Alexey Dobriyan
6e006701cc dnotify: move dir_notify_enable declaration
Move dir_notify_enable declaration to where it belongs -- dnotify.h .

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:01 -04:00
Eric Paris
59b0df211b fsnotify: use unsigned char * for dentry->d_name.name
fsnotify was using char * when it passed around the d_name.name string
internally but it is actually an unsigned char *.  This patch switches
fsnotify to use unsigned and should silence some pointer signess warnings
which have popped out of xfs.  I do not add -Wpointer-sign to the fsnotify
code as there are still issues with kstrdup and strlen which would pop
out needless warnings.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:01 -04:00
Eric Paris
6e5f77b32e fsnotify: intoduce a notification merge argument
Each group can define their own notification (and secondary_q) merge
function.  Inotify does tail drop, fanotify does matching and drop which
can actually allocate a completely new event.  But for fanotify to properly
deal with permissions events it needs to know the new event which was
ultimately added to the notification queue.  This patch just implements a
void ** argument which is passed to the merge function.  fanotify can use
this field to pass the new event back to higher layers.

Signed-off-by: Eric Paris <eparis@redhat.com>
for fanotify to properly deal with permissions events
2010-07-28 09:59:01 -04:00
Eric Paris
cb2d429faf fsnotify: add group priorities
This introduces an ordering to fsnotify groups.  With purely asynchronous
notification based "things" implementing fsnotify (inotify, dnotify) ordering
isn't particularly important.  But if people want to use fsnotify for the
basis of sycronous notification or blocking notification ordering becomes
important.

eg. A Hierarchical Storage Management listener would need to get its event
before an AV scanner could get its event (since the HSM would need to
bring the data in for the AV scanner to scan.)  Typically asynchronous notification
would want to run after the AV scanner made any relevant access decisions
so as to not send notification about an event that was denied.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:01 -04:00
Eric Paris
4d92604cc9 fanotify: clear all fanotify marks
fanotify listeners may want to clear all marks.  They may want to do this
to destroy all of their inode marks which have nothing but ignores.
Realistically this is useful for av vendors who update policy and want to
clear all of their cached allows.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:00 -04:00
Eric Paris
c9778a98e7 fanotify: allow ignored_masks to survive modify
Some users may want to truely ignore an inode even if it has been modified.
Say you are wanting a mount which contains a log file and you really don't
want any notification about that file.  This patch allows the listener to
do that.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:00 -04:00
Eric Paris
c908370fc1 fsnotify: allow ignored_mask to survive modification
Some inodes a group may want to never hear about a set of events even if
the inode is modified.  We add a new mark flag which indicates that these
marks should not have their ignored_mask cleared on modification.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:00 -04:00
Eric Paris
b9e4e3bd04 fanotify: allow users to set an ignored_mask
Change the sys_fanotify_mark() system call so users can set ignored_masks
on inodes.  Remember, if a user new sets a real mask, and only sets ignored
masks, the ignore will never be pinned in memory.  Thus ignored_masks can
be lost under memory pressure and the user may again get events they
previously thought were ignored.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:00 -04:00
Eric Paris
33af5e32e0 fsnotify: ignored_mask - excluding notification
The ignored_mask is a new mask which is part of fsnotify marks.  A group's
should_send_event() function can use the ignored mask to determine that
certain events are not of interest.  In particular if a group registers a
mask including FS_OPEN on a vfsmount they could add FS_OPEN to the
ignored_mask for individual inodes and not send open events for those
inodes.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:59:00 -04:00
Eric Paris
90b1e7a578 fsnotify: allow marks to not pin inodes in core
inotify marks must pin inodes in core.  dnotify doesn't technically need to
since they are closed when the directory is closed.  fanotify also need to
pin inodes in core as it works today.  But the next step is to introduce
the concept of 'ignored masks' which is actually a mask of events for an
inode of no interest.  I claim that these should be liberally sent to the
kernel and should not pin the inode in core.  If the inode is brought back
in the listener will get an event it may have thought excluded, but this is
not a serious situation and one any listener should deal with.

This patch lays the ground work for non-pinning inode marks by using lazy
inode pinning.  We do not pin a mark until it has a non-zero mask entry.  If a
listener new sets a mask we never pin the inode.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:59 -04:00
Andreas Gruenbacher
88380fe66e fanotify: remove fanotify.h declarations
fanotify_mark_validate functions are all needlessly declared in headers as
static inlines.  Instead just do the checks where they are needed for code
readability.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:59 -04:00
Andreas Gruenbacher
eac8e9e80c fanotify: rename FAN_MARK_ON_VFSMOUNT to FAN_MARK_MOUNT
the term 'vfsmount' isn't sensicle to userspace.  instead call is 'mount.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:59 -04:00
Eric Paris
0ff21db9fc fanotify: hooks the fanotify_mark syscall to the vfsmount code
Create a new fanotify_mark flag which indicates we should attach the mark
to the vfsmount holding the object referenced by dfd and pathname rather
than the inode itself.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:59 -04:00
Eric Paris
1c529063a3 fanotify: should_send_event needs to handle vfsmounts
currently should_send_event in fanotify only cares about marks on inodes.
This patch extends that interface to indicate that it cares about events
that happened on vfsmounts.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:57 -04:00
Andreas Gruenbacher
ca9c726eea fsnotify: Infrastructure for per-mount watches
Per-mount watches allow groups to listen to fsnotify events on an entire
mount.  This patch simply adds and initializes the fields needed in the
vfsmount struct to make this happen.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:57 -04:00
Eric Paris
0d48b7f01f fsnotify: vfsmount marks generic functions
Much like inode-mark.c has all of the code dealing with marks on inodes
this patch adds a vfsmount-mark.c which has similar code but is intended
for marks on vfsmounts.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:57 -04:00
Andreas Gruenbacher
2504c5d63b fsnotify/vfsmount: add fsnotify fields to struct vfsmount
This patch adds the list and mask fields needed to support vfsmount marks.
These are the same fields fsnotify needs on an inode.  They are not used,
just declared and we note where the cleanup hook should be (the function is
not yet defined)

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:57 -04:00
Eric Paris
5444e2981c fsnotify: split generic and inode specific mark code
currently all marking is done by functions in inode-mark.c.  Some of this
is pretty generic and should be instead done in a generic function and we
should only put the inode specific code in inode-mark.c

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:57 -04:00
Andreas Gruenbacher
32c3263221 fanotify: Add pids to events
Pass the process identifiers of the triggering processes to fanotify
listeners: this information is useful for event filtering and logging.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:56 -04:00
Eric Paris
a1014f1023 fanotify: send events using read
Send events to userspace by reading the file descriptor from fanotify_init().
One will get blocks of data which look like:

struct fanotify_event_metadata {
	__u32 event_len;
	__u32 vers;
	__s32 fd;
	__u64 mask;
	__s64 pid;
	__u64 cookie;
} __attribute__ ((packed));

Simple code to retrieve and deal with events is below

	while ((len = read(fan_fd, buf, sizeof(buf))) > 0) {
		struct fanotify_event_metadata *metadata;

		metadata = (void *)buf;
		while(FAN_EVENT_OK(metadata, len)) {
			[PROCESS HERE!!]
			if (metadata->fd >= 0 && close(metadata->fd) != 0)
				goto fail;
			metadata = FAN_EVENT_NEXT(metadata, len);
		}
	}

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:56 -04:00
Eric Paris
2a3edf8604 fanotify: fanotify_mark syscall implementation
NAME
	fanotify_mark - add, remove, or modify an fanotify mark on a
filesystem object

SYNOPSIS
	int fanotify_mark(int fanotify_fd, unsigned int flags, u64 mask,
			  int dfd, const char *pathname)

DESCRIPTION
	fanotify_mark() is used to add remove or modify a mark on a filesystem
	object.  Marks are used to indicate that the fanotify group is
	interested in events which occur on that object.  At this point in
	time marks may only be added to files and directories.

	fanotify_fd must be a file descriptor returned by fanotify_init()

	The flags field must contain exactly one of the following:

	FAN_MARK_ADD - or the bits in mask and ignored mask into the mark
	FAN_MARK_REMOVE - bitwise remove the bits in mask and ignored mark
		from the mark

	The following values can be OR'd into the flags field:

	FAN_MARK_DONT_FOLLOW - same meaning as O_NOFOLLOW as described in open(2)
	FAN_MARK_ONLYDIR - same meaning as O_DIRECTORY as described in open(2)

	dfd may be any of the following:
	AT_FDCWD: the object will be lookup up based on pathname similar
		to open(2)

	file descriptor of a directory: if pathname is not NULL the
		object to modify will be lookup up similar to openat(2)

	file descriptor of the final object: if pathname is NULL the
		object to modify will be the object referenced by dfd

	The mask is the bitwise OR of the set of events of interest such as:
	FAN_ACCESS		- object was accessed (read)
	FAN_MODIFY		- object was modified (write)
	FAN_CLOSE_WRITE		- object was writable and was closed
	FAN_CLOSE_NOWRITE	- object was read only and was closed
	FAN_OPEN		- object was opened
	FAN_EVENT_ON_CHILD	- interested in objected that happen to
				  children.  Only relavent when the object
				  is a directory
	FAN_Q_OVERFLOW		- event queue overflowed (not implemented)

RETURN VALUE
	On success, this system call returns 0. On error, -1 is
	returned, and errno is set to indicate the error.

ERRORS
	EINVAL An invalid value was specified in flags.

	EINVAL An invalid value was specified in mask.

	EINVAL An invalid value was specified in ignored_mask.

	EINVAL fanotify_fd is not a file descriptor as returned by
	fanotify_init()

	EBADF fanotify_fd is not a valid file descriptor

	EBADF dfd is not a valid file descriptor and path is NULL.

	ENOTDIR dfd is not a directory and path is not NULL

	EACCESS no search permissions on some part of the path

	ENENT file not found

	ENOMEM Insufficient kernel memory is available.

CONFORMING TO
	These system calls are Linux-specific.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:56 -04:00
Eric Paris
bbaa4168b2 fanotify: sys_fanotify_mark declartion
This patch simply declares the new sys_fanotify_mark syscall

int fanotify_mark(int fanotify_fd, unsigned int flags, u64_mask,
		  int dfd const char *pathname)

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:55 -04:00
Eric Paris
52c923dd07 fanotify: fanotify_init syscall implementation
NAME
	fanotify_init - initialize an fanotify group

SYNOPSIS
	int fanotify_init(unsigned int flags, unsigned int event_f_flags, int priority);

DESCRIPTION
	fanotify_init() initializes a new fanotify instance and returns a file
	descriptor associated with the new fanotify event queue.

	The following values can be OR'd into the flags field:

	FAN_NONBLOCK Set the O_NONBLOCK file status flag on the new open file description.
		Using this flag saves extra calls to fcntl(2) to achieve the same
		result.

	FAN_CLOEXEC Set the close-on-exec (FD_CLOEXEC) flag on the new file descriptor.
		See the description of the O_CLOEXEC flag in open(2) for reasons why
		this may be useful.

	The event_f_flags argument is unused and must be set to 0

	The priority argument is unused and must be set to 0

RETURN VALUE
	On success, this system call return a new file descriptor. On error, -1 is
	returned, and errno is set to indicate the error.

ERRORS
	EINVAL An invalid value was specified in flags.

	EINVAL A non-zero valid was passed in event_f_flags or in priority

	ENFILE The system limit on the total number of file descriptors has been reached.

	ENOMEM Insufficient kernel memory is available.

CONFORMING TO
	These system calls are Linux-specific.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:55 -04:00
Eric Paris
11637e4b7d fanotify: fanotify_init syscall declaration
This patch defines a new syscall fanotify_init() of the form:

int sys_fanotify_init(unsigned int flags, unsigned int event_f_flags,
		      unsigned int priority)

This syscall is used to create and fanotify group.  This is very similar to
the inotify_init() syscall.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:55 -04:00
Eric Paris
ff0b16a985 fanotify: fscking all notification system
fanotify is a novel file notification system which bases notification on
giving userspace both an event type (open, close, read, write) and an open
file descriptor to the object in question.  This should address a number of
races and problems with other notification systems like inotify and dnotify
and should allow the future implementation of blocking or access controlled
notification.  These are useful for on access scanners or hierachical storage
management schemes.

This patch just implements the basics of the fsnotify functions.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:54 -04:00
Signed-off-by: Wu Fengguang
12ed2e36c9 fanotify: FMODE_NONOTIFY and __O_SYNC in sparc conflict
sparc used the same value as FMODE_NONOTIFY so change FMODE_NONOTIFY to be
something unique.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:54 -04:00
Eric Paris
ecf081d1a7 vfs: introduce FMODE_NONOTIFY
This is a new f_mode which can only be set by the kernel.  It indicates
that the fd was opened by fanotify and should not cause future fanotify
events.  This is needed to prevent fanotify livelock.  An example of
obvious livelock is from fanotify close events.

Process A closes file1
This creates a close event for file1.
fanotify opens file1 for Listener X
Listener X deals with the event and closes its fd for file1.
This creates a close event for file1.
fanotify opens file1 for Listener X
Listener X deals with the event and closes its fd for file1.
This creates a close event for file1.
fanotify opens file1 for Listener X
Listener X deals with the event and closes its fd for file1.
notice a pattern?

The fix is to add the FMODE_NONOTIFY bit to the open filp done by the kernel
for fanotify.  Thus when that file is used it will not generate future
events.

This patch simply defines the bit.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:54 -04:00
Eric Paris
841bdc10f5 fsnotify: rename mark_entry to just mark
previously I used mark_entry when talking about marks on inodes.  The
_entry is pretty useless.  Just use "mark" instead.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:53 -04:00