sfrench/cifs-2.6.git
8 years agoxhci: init command timeout timer earlier to avoid deleting it uninitialized
Mathias Nyman [Mon, 21 Sep 2015 14:46:17 +0000 (17:46 +0300)]
xhci: init command timeout timer earlier to avoid deleting it uninitialized

commit cc8e4fc0c3b5e8340bc8358990515d116a3c274c upstream.

Don't check if timer is running with a timer_pending() before
deleting it with del_timer_sync(), this defies the whole point of
the sync part and can cause a possible race.

Instead we just want to make sure the timer is initialized early enough
before we have a chance to delete it.

Reported-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoxhci: change xhci 1.0 only restrictions to support xhci 1.1
Mathias Nyman [Mon, 21 Sep 2015 14:46:16 +0000 (17:46 +0300)]
xhci: change xhci 1.0 only restrictions to support xhci 1.1

commit dca7794539eff04b786fb6907186989e5eaaa9c2 upstream.

Some changes between xhci 0.96 and xhci 1.0 specifications forced us to
check the hci version in code, some of these checks were implemented as
hci_version == 1.0, which will not work with new xhci 1.1 controllers.

xhci 1.1 behaves similar to xhci 1.0 in these cases, so change these
checks to hci_version >= 1.0

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agousb: xhci: exit early in xhci_setup_device() if we're halted or dying
Roger Quadros [Mon, 21 Sep 2015 14:46:15 +0000 (17:46 +0300)]
usb: xhci: exit early in xhci_setup_device() if we're halted or dying

commit 448116bfa856d3c076fa7178ed96661a008a5d45 upstream.

During quick plug/removal of OTG adapter during dual-role testing
it can happen that xhci_alloc_device() is called for the newly
detected device after the DRD library has called xhci_stop to
remove the HCD.

If that is the case, just fail early to prevent the following warning.

[  154.732649] hub 4-0:1.0: USB hub found
[  154.742204] hub 4-0:1.0: 1 port detected
[  154.824458] hub 3-0:1.0: state 7 ports 1 chg 0002 evt 0000
[  154.854609] hub 4-0:1.0: state 7 ports 1 chg 0000 evt 0000
[  154.944430] usb 3-1: new high-speed USB device number 2 using xhci-hcd
[  154.951009] xhci-hcd xhci-hcd.0.auto: xhci_setup_device
[  155.038191] xhci-hcd xhci-hcd.0.auto: remove, state 4
[  155.043315] usb usb4: USB disconnect, device number 1
[  155.055270] xhci-hcd xhci-hcd.0.auto: xhci_stop
[  155.060094] xhci-hcd xhci-hcd.0.auto: USB bus 4 deregistered
[  155.066576] xhci-hcd xhci-hcd.0.auto: remove, state 1
[  155.071710] usb usb3: USB disconnect, device number 1
[  155.077124] xhci-hcd xhci-hcd.0.auto: xhci_setup_device
[  155.082389] ------------[ cut here ]------------
[  155.087690] WARNING: CPU: 0 PID: 72 at drivers/usb/host/xhci.c:3800 xhci_setup_device+0x410/0x484 [xhci_hcd]()
[  155.097861] Modules linked in: sd_mod usb_storage scsi_mod usb_f_ss_lb g_zero libcomposite ipv6 xhci_plat_hcd xhci_hcd usbcore dwc3 udc_core evdev ti_am335x_adc joydev kfifo_buf industrialio snd_soc_simple_cc
[  155.146734] CPU: 0 PID: 72 Comm: kworker/0:3 Tainted: G        W       4.1.4-00834-gcd9380b-dirty #50
[  155.156073] Hardware name: Generic AM43 (Flattened Device Tree)
[  155.162117] Workqueue: usb_hub_wq hub_event [usbcore]
[  155.167249] Backtrace:
[  155.169751] [<c0012af0>] (dump_backtrace) from [<c0012c8c>] (show_stack+0x18/0x1c)
[  155.177390]  r6:c089d4a4 r5:ffffffff r4:00000000 r3:ee46c000
[  155.183137] [<c0012c74>] (show_stack) from [<c05f7c14>] (dump_stack+0x84/0xd0)
[  155.190446] [<c05f7b90>] (dump_stack) from [<c00439ac>] (warn_slowpath_common+0x80/0xbc)
[  155.198605]  r7:00000009 r6:00000ed8 r5:bf27eb70 r4:00000000
[  155.204348] [<c004392c>] (warn_slowpath_common) from [<c0043a0c>] (warn_slowpath_null+0x24/0x2c)
[  155.213202]  r8:ee49f000 r7:ee7c0004 r6:00000000 r5:ee7c0158 r4:ee7c0000
[  155.220051] [<c00439e8>] (warn_slowpath_null) from [<bf27eb70>] (xhci_setup_device+0x410/0x484 [xhci_hcd])
[  155.229816] [<bf27e760>] (xhci_setup_device [xhci_hcd]) from [<bf27ec10>] (xhci_address_device+0x14/0x18 [xhci_hcd])
[  155.240415]  r10:ee598200 r9:00000001 r8:00000002 r7:00000001 r6:00000003 r5:00000002
[  155.248363]  r4:ee49f000
[  155.250978] [<bf27ebfc>] (xhci_address_device [xhci_hcd]) from [<bf20cb94>] (hub_port_init+0x1b8/0xa9c [usbcore])
[  155.261403] [<bf20c9dc>] (hub_port_init [usbcore]) from [<bf2101e0>] (hub_event+0x738/0x1020 [usbcore])
[  155.270874]  r10:ee598200 r9:ee7c0000 r8:ee7c0038 r7:ee518800 r6:ee49f000 r5:00000001
[  155.278822]  r4:00000000
[  155.281426] [<bf20faa8>] (hub_event [usbcore]) from [<c005754c>] (process_one_work+0x128/0x340)
[  155.290196]  r10:00000000 r9:00000003 r8:00000000 r7:fedfa000 r6:eeec5400 r5:ee598314
[  155.298151]  r4:ee434380
[  155.300718] [<c0057424>] (process_one_work) from [<c00578f8>] (worker_thread+0x158/0x49c)
[  155.308963]  r10:ee434380 r9:00000003 r8:eeec5400 r7:00000008 r6:ee434398 r5:eeec5400
[  155.316913]  r4:eeec5414
[  155.319482] [<c00577a0>] (worker_thread) from [<c005cc40>] (kthread+0xdc/0xf8)
[  155.326765]  r10:00000000 r9:00000000 r8:00000000 r7:c00577a0 r6:ee434380 r5:ee4441c0
[  155.334713]  r4:00000000 r3:00000000
[  155.338341] [<c005cb64>] (kthread) from [<c000fc08>] (ret_from_fork+0x14/0x2c)
[  155.345626]  r7:00000000 r6:00000000 r5:c005cb64 r4:ee4441c0
[  155.356108] ---[ end trace a58d34c223b190e6 ]---
[  155.360783] xhci-hcd xhci-hcd.0.auto: Virt dev invalid for slot_id 0x1!
[  155.574404] xhci-hcd xhci-hcd.0.auto: xhci_setup_device
[  155.579667] ------------[ cut here ]------------

Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agousb: xhci: stop everything on the first call to xhci_stop
Roger Quadros [Mon, 21 Sep 2015 14:46:14 +0000 (17:46 +0300)]
usb: xhci: stop everything on the first call to xhci_stop

commit 8c24d6d7b09deee3036ddc4f2b81b53b28c8f877 upstream.

xhci_stop will be called twice, once for the shared hcd
and again for the primary hcd.

We stop the XHCI controller in any case so clean up
everything on the first call else we can timeout
waiting for pending requests to complete.

Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agousb: xhci: Clear XHCI_STATE_DYING on start
Roger Quadros [Mon, 21 Sep 2015 14:46:13 +0000 (17:46 +0300)]
usb: xhci: Clear XHCI_STATE_DYING on start

commit e5bfeab0ad515b4f6df39fe716603e9dc6d3dfd0 upstream.

For whatever reason if XHCI died in the previous instant
then it will never recover on the next xhci_start unless we
clear the DYING flag.

Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agousb: xhci: lock mutex on xhci_stop
Roger Quadros [Mon, 21 Sep 2015 14:46:12 +0000 (17:46 +0300)]
usb: xhci: lock mutex on xhci_stop

commit 85ac90f8953a58f6a057b727bc9db97721e3fb8e upstream.

Else it races with xhci_setup_device

Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoxhci: Move xhci_pme_quirk() behind #ifdef CONFIG_PM
Tomer Barletz [Mon, 21 Sep 2015 14:46:11 +0000 (17:46 +0300)]
xhci: Move xhci_pme_quirk() behind #ifdef CONFIG_PM

commit 2b7627b73e81e5d23d5ae1490fe8e690af86e053 upstream.

xhci_pme_quirk() is only used when CONFIG_PM is defined.
Compiling a kernel without PM complains about this function

[reworded commit message -Mathias]
Signed-off-by: Tomer Barletz <barletz@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoxhci: give command abortion one more chance before killing xhci
Mathias Nyman [Mon, 21 Sep 2015 14:46:10 +0000 (17:46 +0300)]
xhci: give command abortion one more chance before killing xhci

commit a6809ffd1687b3a8c192960e69add559b9d32649 upstream.

We want to give the command abortion an additional try to stop
the command ring before we completely hose xhci.

Tested-by: Vincent Pelletier <plr.vincent@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoUSB: whiteheat: fix potential null-deref at probe
Johan Hovold [Wed, 23 Sep 2015 18:41:42 +0000 (11:41 -0700)]
USB: whiteheat: fix potential null-deref at probe

commit cbb4be652d374f64661137756b8f357a1827d6a4 upstream.

Fix potential null-pointer dereference at probe by making sure that the
required endpoints are present.

The whiteheat driver assumes there are at least five pairs of bulk
endpoints, of which the final pair is used for the "command port". An
attempt to bind to an interface with fewer bulk endpoints would
currently lead to an oops.

Fixes CVE-2015-5257.

Reported-by: Moein Ghasemzadeh <moein@istuary.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/dp/mst: drop cancel work sync in the mstb destroy path (v2)
Dave Airlie [Wed, 30 Sep 2015 00:39:42 +0000 (10:39 +1000)]
drm/dp/mst: drop cancel work sync in the mstb destroy path (v2)

commit 274d83524895fe41ca8debae4eec60ede7252bb5 upstream.

Since 9eb1e57f564d4e6e10991402726cc83fe0b9172f
drm/dp/mst: make sure mst_primary mstb is valid in work function

we validate the mstb structs in the work function, and doing
that takes a reference. So we should never get here with the
work function running using the mstb device, only if the work
function hasn't run yet or is running for another mstb.

So we don't need to sync the work here, this was causing
lockdep spew as below.

[  +0.000160] =============================================
[  +0.000001] [ INFO: possible recursive locking detected ]
[  +0.000002] 3.10.0-320.el7.rhel72.stable.backport.3.x86_64.debug #1 Tainted: G        W      ------------
[  +0.000001] ---------------------------------------------
[  +0.000001] kworker/4:2/1262 is trying to acquire lock:
[  +0.000001]  ((&mgr->work)){+.+.+.}, at: [<ffffffff810b29a5>] flush_work+0x5/0x2e0
[  +0.000007]
but task is already holding lock:
[  +0.000001]  ((&mgr->work)){+.+.+.}, at: [<ffffffff810b57e4>] process_one_work+0x1b4/0x710
[  +0.000004]
other info that might help us debug this:
[  +0.000001]  Possible unsafe locking scenario:

[  +0.000002]        CPU0
[  +0.000000]        ----
[  +0.000001]   lock((&mgr->work));
[  +0.000002]   lock((&mgr->work));
[  +0.000001]
 *** DEADLOCK ***

[  +0.000001]  May be due to missing lock nesting notation

[  +0.000002] 2 locks held by kworker/4:2/1262:
[  +0.000001]  #0:  (events_long){.+.+.+}, at: [<ffffffff810b57e4>] process_one_work+0x1b4/0x710
[  +0.000004]  #1:  ((&mgr->work)){+.+.+.}, at: [<ffffffff810b57e4>] process_one_work+0x1b4/0x710
[  +0.000003]
stack backtrace:
[  +0.000003] CPU: 4 PID: 1262 Comm: kworker/4:2 Tainted: G        W      ------------   3.10.0-320.el7.rhel72.stable.backport.3.x86_64.debug #1
[  +0.000001] Hardware name: LENOVO 20EGS0R600/20EGS0R600, BIOS GNET71WW (2.19 ) 02/05/2015
[  +0.000008] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
[  +0.000001]  ffffffff82c26c90 00000000a527b914 ffff88046399bae8 ffffffff816fe04d
[  +0.000004]  ffff88046399bb58 ffffffff8110f47f ffff880461438000 0001009b840fc003
[  +0.000002]  ffff880461438a98 0000000000000000 0000000804dc26e1 ffffffff824a2c00
[  +0.000003] Call Trace:
[  +0.000004]  [<ffffffff816fe04d>] dump_stack+0x19/0x1b
[  +0.000004]  [<ffffffff8110f47f>] __lock_acquire+0x115f/0x1250
[  +0.000002]  [<ffffffff8110fd49>] lock_acquire+0x99/0x1e0
[  +0.000002]  [<ffffffff810b29a5>] ? flush_work+0x5/0x2e0
[  +0.000002]  [<ffffffff810b29ee>] flush_work+0x4e/0x2e0
[  +0.000002]  [<ffffffff810b29a5>] ? flush_work+0x5/0x2e0
[  +0.000004]  [<ffffffff81025905>] ? native_sched_clock+0x35/0x80
[  +0.000002]  [<ffffffff81025959>] ? sched_clock+0x9/0x10
[  +0.000002]  [<ffffffff810da1f5>] ? local_clock+0x25/0x30
[  +0.000002]  [<ffffffff8110dca9>] ? mark_held_locks+0xb9/0x140
[  +0.000003]  [<ffffffff810b4ed5>] ? __cancel_work_timer+0x95/0x160
[  +0.000002]  [<ffffffff810b4ee8>] __cancel_work_timer+0xa8/0x160
[  +0.000002]  [<ffffffff810b4fb0>] cancel_work_sync+0x10/0x20
[  +0.000007]  [<ffffffffa0160d17>] drm_dp_destroy_mst_branch_device+0x27/0x120 [drm_kms_helper]
[  +0.000006]  [<ffffffffa0163968>] drm_dp_mst_link_probe_work+0x78/0xa0 [drm_kms_helper]
[  +0.000002]  [<ffffffff810b5850>] process_one_work+0x220/0x710
[  +0.000002]  [<ffffffff810b57e4>] ? process_one_work+0x1b4/0x710
[  +0.000005]  [<ffffffff810b5e5b>] worker_thread+0x11b/0x3a0
[  +0.000003]  [<ffffffff810b5d40>] ? process_one_work+0x710/0x710
[  +0.000002]  [<ffffffff810beced>] kthread+0xed/0x100
[  +0.000003]  [<ffffffff810bec00>] ? insert_kthread_work+0x80/0x80
[  +0.000003]  [<ffffffff817121d8>] ret_from_fork+0x58/0x90

v2: add flush_work.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/dp/mst: fixup handling hotplug on port removal.
Dave Airlie [Wed, 16 Sep 2015 00:37:28 +0000 (10:37 +1000)]
drm/dp/mst: fixup handling hotplug on port removal.

commit df4839fdc9b3c922586b945f062f38cbbda022bb upstream.

output ports should always have a connector, unless
in the rare case connector allocation fails in the
driver.

In this case we only need to teardown the pdt,
and free the struct, and there is no need to
send a hotplug msg.

In the case were we add the port to the destroy
list we need to send a hotplug if we destroy
any connectors, so userspace knows to reprobe
stuff.

this patch also handles port->connector allocation
failing which should be a rare event, but makes
the code consistent.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/radeon: Restore LCD backlight level on resume (>= R5xx)
Michel Dänzer [Mon, 28 Sep 2015 09:16:31 +0000 (18:16 +0900)]
drm/radeon: Restore LCD backlight level on resume (>= R5xx)

commit 4281f46ef839050d2ef60348f661eb463c21cc2e upstream.

Instead of only enabling the backlight (which seems to set it to max
brightness), just re-set the current backlight level, which also takes
care of enabling the backlight if necessary.

Only the radeon_atom_encoder_dpms_dig part tested on a Kaveri laptop,
the radeon_atom_encoder_dpms_avivo part is only compile tested.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm: Reject DRI1 hw lock ioctl functions for kms drivers
Daniel Vetter [Tue, 23 Jun 2015 09:34:21 +0000 (11:34 +0200)]
drm: Reject DRI1 hw lock ioctl functions for kms drivers

commit da168d81b44898404d281d5dbe70154ab5f117c1 upstream.

I've done some extensive history digging across libdrm, mesa and
xf86-video-{intel,nouveau,ati}. The only potential user of this with
kms drivers I could find was ttmtest, which once used drmGetLock
still. But that mistake was quickly fixed up. Even the intel xvmc
library (which otherwise was really good with using dri1 stuff in kms
mode) managed to never take the hw lock for dri2 (and hence kms).

Hence it should be save to unconditionally disallow this.

Cc: Peter Antoine <peter.antoine@intel.com>
Reviewed-by: Peter Antoine <peter.antoine@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/i915/bios: handle MIPI Sequence Block v3+ gracefully
Jani Nikula [Thu, 17 Sep 2015 13:42:07 +0000 (16:42 +0300)]
drm/i915/bios: handle MIPI Sequence Block v3+ gracefully

commit cd67d226ebd909d239d2c6e5a6abd6e2a338d1cd upstream.

The VBT MIPI Sequence Block version 3 has forward incompatible changes:

First, the block size in the header has been specified reserved, and the
actual size is a separate 32-bit value within the block. The current
find_section() function to will only look at the size in the block
header, and, depending on what's in that now reserved size field,
continue looking for other sections in the wrong place.

Fix this by taking the new block size field into account. This will
ensure that the lookups for other sections will work properly, as long
as the new 32-bit size does not go beyond the opregion VBT mailbox size.

Second, the contents of the block have been completely
changed. Gracefully refuse parsing the yet unknown data version.

Cc: Deepak M <m.deepak@intel.com>
Reviewed-by: Deepak M <m.deepak@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: Restore LCD backlight level on resume
Alex Deucher [Tue, 29 Sep 2015 17:53:30 +0000 (13:53 -0400)]
drm/amdgpu: Restore LCD backlight level on resume

commit 74b3112e95073b351e3b0b9799795bc76f8415fa upstream.

Instead of only enabling the backlight (which seems to set it to max
brightness), just re-set the current backlight level, which also takes
care of enabling the backlight if necessary.

Port of radeon commit:
drm/radeon: Restore LCD backlight level on resume (>= R5xx)

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: Fix max_vblank_count value for current display engines
Alex Deucher [Tue, 22 Sep 2015 14:06:45 +0000 (10:06 -0400)]
drm/amdgpu: Fix max_vblank_count value for current display engines

commit 5a6adfa20b622a273205e33b20c12332aa7eb724 upstream.

The value was much too low, which could cause the userspace visible
vblank counter to move backwards when the hardware counter wrapped
around.

Ported from radeon commit:
b0b9bb4dd51f396dcf843831905f729e74b0c8c0

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: make UVD handle checking more strict
Leo Liu [Tue, 15 Sep 2015 14:38:38 +0000 (10:38 -0400)]
drm/amdgpu: make UVD handle checking more strict

commit 5146419e6feb99cfbc8dbf005dd2f62603e15efb upstream.

Invalid messages can crash the hw otherwise

Ported from radeon commit a1b403da70e038ca6c6c6fe434d1d873546873a3

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: fix the UVD suspend sequence order
Leo Liu [Fri, 11 Sep 2015 18:22:18 +0000 (14:22 -0400)]
drm/amdgpu: fix the UVD suspend sequence order

commit 2bd188d0167227932be3cf5b033c0e600b01291f upstream.

Fixes suspend issues with UVD.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: Disable UVD PG
Leo Liu [Thu, 10 Sep 2015 17:41:38 +0000 (13:41 -0400)]
drm/amdgpu: Disable UVD PG

commit 1ee4478a26cf55c8f8a6219d7e99f2b48959394d upstream.

This causes problems with multiple suspend/resume cycles.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: fix overflow on 32bit systems
Christian König [Mon, 7 Sep 2015 10:32:09 +0000 (12:32 +0200)]
drm/amdgpu: fix overflow on 32bit systems

commit b7d698d7fd7d132c6ebe56d230584f2cae6c94ee upstream.

mem->start is a long, so this can overflow on 32bit systems.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/qxl: recreate the primary surface when the bo is not primary
Fabiano Fidêncio [Thu, 24 Sep 2015 13:18:34 +0000 (15:18 +0200)]
drm/qxl: recreate the primary surface when the bo is not primary

commit 8d0d94015e96b8853c4f7f06eac3f269e1b3d866 upstream.

When disabling/enabling a crtc the primary area must be updated
independently of which crtc has been disabled/enabled.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1264735

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/qxl: only report first monitor as connected if we have no state
Dave Airlie [Mon, 14 Sep 2015 00:28:34 +0000 (10:28 +1000)]
drm/qxl: only report first monitor as connected if we have no state

commit 69e5d3f893e19613486f300fd6e631810338aa4b upstream.

If the server isn't new enough to give us state, report the first
monitor as always connected, otherwise believe the server side.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoDo not fall back to SMBWriteX in set_file_size error cases
Steve French [Mon, 28 Sep 2015 22:21:07 +0000 (17:21 -0500)]
Do not fall back to SMBWriteX in set_file_size error cases

commit 646200a041203f440fb6fcf9cacd9efeda9de74c upstream.

The error paths in set_file_size for cifs and smb3 are incorrect.

In the unlikely event that a server did not support set file info
of the file size, the code incorrectly falls back to trying SMBWriteX
(note that only the original core SMB Write, used for example by DOS,
can set the file size this way - this actually  does not work for the more
recent SMBWriteX).  The idea was since the old DOS SMB Write could set
the file size if you write zero bytes at that offset then use that if
server rejects the normal set file info call.

Fortunately the SMBWriteX will never be sent on the wire (except when
file size is zero) since the length and offset fields were reversed
in the two places in this function that call SMBWriteX causing
the fall back path to return an error. It is also important to never call
an SMB request from an SMB2/sMB3 session (which theoretically would
be possible, and can cause a brief session drop, although the client
recovers) so this should be fixed.  In practice this path does not happen
with modern servers but the error fall back to SMBWriteX is clearly wrong.

Removing the calls to SMBWriteX in the error paths in cifs_set_file_size

Pointed out by PaX/grsecurity team

Signed-off-by: Steve French <steve.french@primarydata.com>
Reported-by: PaX Team <pageexec@freemail.hu>
CC: Emese Revfy <re.emese@gmail.com>
CC: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodisabling oplocks/leases via module parm enable_oplocks broken for SMB3
Steve French [Tue, 22 Sep 2015 14:29:38 +0000 (09:29 -0500)]
disabling oplocks/leases via module parm enable_oplocks broken for SMB3

commit e0ddde9d44e37fbc21ce893553094ecf1a633ab5 upstream.

leases (oplocks) were always requested for SMB2/SMB3 even when oplocks
disabled in the cifs.ko module.

Signed-off-by: Steve French <steve.french@primarydata.com>
Reviewed-by: Chandrika Srinivasan <chandrika.srinivasan@citrix.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoFix sec=krb5 on smb3 mounts
Steve French [Thu, 24 Sep 2015 05:52:37 +0000 (00:52 -0500)]
Fix sec=krb5 on smb3 mounts

commit ceb1b0b9b4d1089e9f2731a314689ae17784c861 upstream.

Kerberos, which is very important for security, was only enabled for
CIFS not SMB2/SMB3 mounts (e.g. vers=3.0)

Patch based on the information detailed in
http://thread.gmane.org/gmane.linux.kernel.cifs/10081/focus=10307
to enable Kerberized SMB2/SMB3

a) SMB2_negotiate: enable/use decode_negTokenInit in SMB2_negotiate
b) SMB2_sess_setup: handle Kerberos sectype and replicate Kerberos
   SMB1 processing done in sess_auth_kerberos

Signed-off-by: Noel Power <noel.power@suse.com>
Signed-off-by: Jim McDonough <jmcd@samba.org>
Signed-off-by: Steve French <steve.french@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoNFS: Fix a write performance regression
Trond Myklebust [Thu, 1 Oct 2015 22:38:27 +0000 (18:38 -0400)]
NFS: Fix a write performance regression

commit 8fa4592a14ebb3c22a21d846d1e4f65dab7d1a7c upstream.

If all other conditions in nfs_can_extend_write() are met, and there
are no locks, then we should be able to assume close-to-open semantics
and the ability to extend our write to cover the whole page.

With this patch, the xfstests generic/074 test completes in 242s instead
of >1400s on my test rig.

Fixes: bd61e0a9c852 ("locks: convert posix locks to file_lock_context")
Cc: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonfs: fix pg_test page count calculation
Peng Tao [Fri, 11 Sep 2015 03:14:06 +0000 (11:14 +0800)]
nfs: fix pg_test page count calculation

commit 048883e0b934d9a5103d40e209cb14b7f33d2933 upstream.

We really want sizeof(struct page *) instead. Otherwise we limit
maximum IO size to 64 pages rather than 512 pages on a 64bit system.

Fixes 2e11f829(nfs: cap request size to fit a kmalloced page array).

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Fixes: 2e11f8296d22 ("nfs: cap request size to fit a kmalloced page array")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoNFSv4: Recovery of recalled read delegations is broken
Trond Myklebust [Sun, 20 Sep 2015 14:50:17 +0000 (10:50 -0400)]
NFSv4: Recovery of recalled read delegations is broken

commit 24311f884189d42d40354a6f38ca218eb9aeb811 upstream.

When a read delegation is being recalled, and we're reclaiming the
cached opens, we need to make sure that we only reclaim read-only
modes.
A previous attempt to do this, relied on retrieving the delegation
type from the nfs4_opendata structure. Unfortunately, as Kinglong
pointed out, this field can only be set when performing reboot recovery.

Furthermore, if we call nfs4_open_recover(), then we end up clobbering
the state->flags for all modes that we're not recovering...

The fix is to have the delegation recall code pass this information
to the recovery call, and then refactor the recovery code so that
nfs4_open_delegation_recall() does not need to call nfs4_open_recover().

Reported-by: Kinglong Mee <kinglongmee@gmail.com>
Fixes: 39f897fdbd46 ("NFSv4: When returning a delegation, don't...")
Tested-by: Kinglong Mee <kinglongmee@gmail.com>
Cc: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoNFS: Do cleanup before resetting pageio read/write to mds
Kinglong Mee [Sun, 20 Sep 2015 15:03:28 +0000 (23:03 +0800)]
NFS: Do cleanup before resetting pageio read/write to mds

commit 6f29b9bba7b08c6b1d6f2cc4cf750b342fc1946c upstream.

There is a reference leak of layout segment after resetting
pageio read/write to mds.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonfs: fix v4.2 SEEK on files over 2 gigs
J. Bruce Fields [Wed, 16 Sep 2015 21:21:27 +0000 (17:21 -0400)]
nfs: fix v4.2 SEEK on files over 2 gigs

commit 306a5549355966e480e0dcacdc6b9321d153e0c0 upstream.

We're incorrectly assigning a loff_t return to an int.  If SEEK_HOLE or
SEEK_DATA returns an offset over 2^31 then the application will see a
weird lseek() result (usually -EIO).

Fixes: bdcc2cd14e4e "NFSv4.2: handle NFS-specific llseek errors"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Reviewed-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoBluetooth: Delay check for conn->smp in smp_conn_security()
Johan Hedberg [Fri, 4 Sep 2015 09:22:46 +0000 (12:22 +0300)]
Bluetooth: Delay check for conn->smp in smp_conn_security()

commit d8949aad3eab5d396f4fefcd581773bf07b9a79e upstream.

There are several actions that smp_conn_security() might make that do
not require a valid SMP context (conn->smp pointer). One of these
actions is to encrypt the link with an existing LTK. If the SMP
context wasn't initialized properly we should still allow the
independent actions to be done, i.e. the check for the context should
only be done at the last possible moment.

Reported-by: Chuck Ebbert <cebbert.lkml@gmail.com>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoregulator: core: Handle probe deferral from DT when resolving supplies
Mark Brown [Thu, 1 Oct 2015 09:59:48 +0000 (10:59 +0100)]
regulator: core: Handle probe deferral from DT when resolving supplies

commit 06423121d9eba0a56b9341cf82b88479017bce14 upstream.

When resolving regulator-regulator supplies we ignore probe deferral
returns from regulator_dev_lookup() (such as are generated for DT when
we can see a supply is registered) and just fall back to the dummy
regulator if there are full constraints (as is the case for DT).  This
means that probe deferral is broken for DT systems, fix that by paying
attention to -EPROBE_DEFER return codes like we do -ENODEV.

A further patch will simplify this further, this is a minimal fix for
the specific issue.

Fixes: 9f7e25edb1575a6d2 (regulator: core: Handle full constraints systems when resolving supplies)
Reported-by: Sascha Hauer <s.hauer@pengutronix.de>
Tested-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Mark Brown <broonnie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoregulator: axp20x: Fix enable bit indexes for DCDC4 and DCDC5
Chen-Yu Tsai [Sat, 26 Sep 2015 13:21:12 +0000 (21:21 +0800)]
regulator: axp20x: Fix enable bit indexes for DCDC4 and DCDC5

commit 6b3600b4ba0810c84437cf76556d9afbd55c1bfc upstream.

The enable bit indexes for DCDC4 and DCDC5 regulators are off by 1.

We haven't run into any problems with this since either the regulators
aren't defined in the DT and aren't used, or all the DCDC regulators
have the "always-on" property set, as they are almost always used
for system critical loads.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoregulator: core: Correct return value check in regulator_resolve_supply
Charles Keepax [Thu, 17 Sep 2015 13:50:20 +0000 (14:50 +0100)]
regulator: core: Correct return value check in regulator_resolve_supply

commit 23c3f310e897837aeb8ffe8700b803cb58e7b35d upstream.

The ret pointer passed to regulator_dev_lookup is only filled with a
valid error code if regulator_dev_lookup returned NULL. Currently
regulator_resolve_supply checks this ret value before it checks if a
regulator was returned, this can result in valid regulator lookups being
ignored.

Fixes: 6261b06de565 ("regulator: Defer lookup of supply to regulator_get")
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonetfilter: nf_log: don't zap all loggers on unregister
Florian Westphal [Wed, 9 Sep 2015 00:57:21 +0000 (02:57 +0200)]
netfilter: nf_log: don't zap all loggers on unregister

commit 205ee117d4dc4a11ac3bd9638bb9b2e839f4de9a upstream.

like nf_log_unset, nf_log_unregister must not reset the list of loggers.
Otherwise, a call to nf_log_unregister() will render loggers of other nf
protocols unusable:

iptables -A INPUT -j LOG
modprobe nf_log_arp ; rmmod nf_log_arp
iptables -A INPUT -j LOG
iptables: No chain/target/match by that name

Fixes: 30e0c6a6be ("netfilter: nf_log: prepare net namespace support for loggers")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonetfilter: nft_compat: skip family comparison in case of NFPROTO_UNSPEC
Pablo Neira Ayuso [Mon, 14 Sep 2015 16:04:09 +0000 (18:04 +0200)]
netfilter: nft_compat: skip family comparison in case of NFPROTO_UNSPEC

commit ba378ca9c04a5fc1b2cf0f0274a9d02eb3d1bad9 upstream.

Fix lookup of existing match/target structures in the corresponding list
by skipping the family check if NFPROTO_UNSPEC is used.

This is resulting in the allocation and insertion of one match/target
structure for each use of them. So this not only bloats memory
consumption but also severely affects the time to reload the ruleset
from the iptables-compat utility.

After this patch, iptables-compat-restore and iptables-compat take
almost the same time to reload large rulesets.

Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonetfilter: nf_log: wait for rcu grace after logger unregistration
Pablo Neira Ayuso [Thu, 17 Sep 2015 11:37:00 +0000 (13:37 +0200)]
netfilter: nf_log: wait for rcu grace after logger unregistration

commit ad5001cc7cdf9aaee5eb213fdee657e4a3c94776 upstream.

The nf_log_unregister() function needs to call synchronize_rcu() to make sure
that the objects are not dereferenced anymore on module removal.

Fixes: 5962815a6a56 ("netfilter: nf_log: use an array of loggers instead of list")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonetfilter: conntrack: use nf_ct_tmpl_free in CT/synproxy error paths
Daniel Borkmann [Mon, 31 Aug 2015 17:11:02 +0000 (19:11 +0200)]
netfilter: conntrack: use nf_ct_tmpl_free in CT/synproxy error paths

commit 9cf94eab8b309e8bcc78b41dd1561c75b537dd0b upstream.

Commit 0838aa7fcfcd ("netfilter: fix netns dependencies with conntrack
templates") migrated templates to the new allocator api, but forgot to
update error paths for them in CT and synproxy to use nf_ct_tmpl_free()
instead of nf_conntrack_free().

Due to that, memory is being freed into the wrong kmemcache, but also
we drop the per net reference count of ct objects causing an imbalance.

In Brad's case, this leads to a wrap-around of net->ct.count and thus
lets __nf_conntrack_alloc() refuse to create a new ct object:

  [   10.340913] xt_addrtype: ipv6 does not support BROADCAST matching
  [   10.810168] nf_conntrack: table full, dropping packet
  [   11.917416] r8169 0000:07:00.0 eth0: link up
  [   11.917438] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
  [   12.815902] nf_conntrack: table full, dropping packet
  [   15.688561] nf_conntrack: table full, dropping packet
  [   15.689365] nf_conntrack: table full, dropping packet
  [   15.690169] nf_conntrack: table full, dropping packet
  [   15.690967] nf_conntrack: table full, dropping packet
  [...]

With slab debugging, it also reports the wrong kmemcache (kmalloc-512 vs.
nf_conntrack_ffffffff81ce75c0) and reports poison overwrites, etc. Thus,
to fix the problem, export and use nf_ct_tmpl_free() instead.

Fixes: 0838aa7fcfcd ("netfilter: fix netns dependencies with conntrack templates")
Reported-by: Brad Jackson <bjackson0971@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonetfilter: ipset: Fixing unnamed union init
Elad Raz [Sat, 22 Aug 2015 05:44:11 +0000 (08:44 +0300)]
netfilter: ipset: Fixing unnamed union init

commit 96be5f2806cd65a2ebced3bfcdf7df0116e6c4a6 upstream.

In continue to proposed Vinson Lee's post [1], this patch fixes compilation
issues founded at gcc 4.4.7. The initialization of .cidr field of unnamed
unions causes compilation error in gcc 4.4.x.

References

Visible links
[1] https://lkml.org/lkml/2015/7/5/74

Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonetfilter: ipset: Out of bound access in hash:net* types fixed
Jozsef Kadlecsik [Tue, 25 Aug 2015 09:17:51 +0000 (11:17 +0200)]
netfilter: ipset: Out of bound access in hash:net* types fixed

commit 6fe7ccfd77415a6ba250c10c580eb3f9acf79753 upstream.

Dave Jones reported that KASan detected out of bounds access in hash:net*
types:

[   23.139532] ==================================================================
[   23.146130] BUG: KASan: out of bounds access in hash_net4_add_cidr+0x1db/0x220 at addr ffff8800d4844b58
[   23.152937] Write of size 4 by task ipset/457
[   23.159742] =============================================================================
[   23.166672] BUG kmalloc-512 (Not tainted): kasan: bad access detected
[   23.173641] -----------------------------------------------------------------------------
[   23.194668] INFO: Allocated in hash_net_create+0x16a/0x470 age=7 cpu=1 pid=456
[   23.201836]  __slab_alloc.constprop.66+0x554/0x620
[   23.208994]  __kmalloc+0x2f2/0x360
[   23.216105]  hash_net_create+0x16a/0x470
[   23.223238]  ip_set_create+0x3e6/0x740
[   23.230343]  nfnetlink_rcv_msg+0x599/0x640
[   23.237454]  netlink_rcv_skb+0x14f/0x190
[   23.244533]  nfnetlink_rcv+0x3f6/0x790
[   23.251579]  netlink_unicast+0x272/0x390
[   23.258573]  netlink_sendmsg+0x5a1/0xa50
[   23.265485]  SYSC_sendto+0x1da/0x2c0
[   23.272364]  SyS_sendto+0xe/0x10
[   23.279168]  entry_SYSCALL_64_fastpath+0x12/0x6f

The bug is fixed in the patch and the testsuite is extended in ipset
to check cidr handling more thoroughly.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonetfilter: nf_tables: Use 32 bit addressing register from nft_type_to_reg()
Pablo Neira Ayuso [Wed, 12 Aug 2015 15:41:00 +0000 (17:41 +0200)]
netfilter: nf_tables: Use 32 bit addressing register from nft_type_to_reg()

commit bf798657eb5ba57552096843c315f096fdf9b715 upstream.

nft_type_to_reg() needs to return the register in the new 32 bit addressing,
otherwise we hit EINVAL when using mappings.

Fixes: 49499c3 ("netfilter: nf_tables: switch registers to 32 bit addressing")
Reported-by: Andreas Schultz <aschultz@tpip.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonetfilter: nfnetlink: work around wrong endianess in res_id field
Pablo Neira Ayuso [Fri, 28 Aug 2015 19:01:43 +0000 (21:01 +0200)]
netfilter: nfnetlink: work around wrong endianess in res_id field

commit a9de9777d613500b089a7416f936bf3ae5f070d2 upstream.

The convention in nfnetlink is to use network byte order in every header field
as well as in the attribute payload. The initial version of the batching
infrastructure assumes that res_id comes in host byte order though.

The only client of the batching infrastructure is nf_tables, so let's add a
workaround to address this inconsistency. We currently have 11 nfnetlink
subsystems according to NFNL_SUBSYS_COUNT, so we can assume that the subsystem
2560, ie. htons(10), will not be allocated anytime soon, so it can be an alias
of nf_tables from the nfnetlink batching path when interpreting the res_id
field.

Based on original patch from Florian Westphal.

Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonetfilter: bridge: fix IPv6 packets not being bridged with CONFIG_IPV6=n
Bernhard Thaler [Thu, 13 Aug 2015 06:58:15 +0000 (08:58 +0200)]
netfilter: bridge: fix IPv6 packets not being bridged with CONFIG_IPV6=n

commit 18e1db67e93ed75d9dc0d34c8d783ccf10547c2b upstream.

230ac490f7fba introduced a dependency to CONFIG_IPV6 which breaks bridging
of IPv6 packets on a bridge with CONFIG_IPV6=n.

Sysctl entry /proc/sys/net/bridge/bridge-nf-call-ip6tables defaults to 1,
for this reason packets are handled by br_nf_pre_routing_ipv6(). When compiled
with CONFIG_IPV6=n this function returns NF_DROP but should return NF_ACCEPT
to let packets through.

Change CONFIG_IPV6=n br_nf_pre_routing_ipv6() return value to NF_ACCEPT.

Tested with a simple bridge with two interfaces and IPv6 packets trying
to pass from host on left side to host on right side of the bridge.

Fixes: 230ac490f7fba ("netfilter: bridge: split ipv6 code into separated file")
Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodm raid: fix round up of default region size
Mikulas Patocka [Fri, 2 Oct 2015 15:17:37 +0000 (11:17 -0400)]
dm raid: fix round up of default region size

commit 042745ee53a0a7c1f5aff191a4a24213c6dcfb52 upstream.

Commit 3a0f9aaee028 ("dm raid: round region_size to power of two")
intended to make sure that the default region size is a power of two.
However, the logic in that commit is incorrect and sets the variable
region_size to 0 or 1, depending on whether min_region_size is a power
of two.

Fix this logic, using roundup_pow_of_two(), so that region_size is
properly rounded up to the next power of two.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: 3a0f9aaee028 ("dm raid: round region_size to power of two")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agomd/raid0: apply base queue limits *before* disk_stack_limits
NeilBrown [Thu, 24 Sep 2015 05:47:47 +0000 (15:47 +1000)]
md/raid0: apply base queue limits *before* disk_stack_limits

commit 66eefe5de11db1e0d8f2edc3880d50e7c36a9d43 upstream.

Calling e.g. blk_queue_max_hw_sectors() after calls to
disk_stack_limits() discards the settings determined by
disk_stack_limits().
So we need to make those calls first.

Fixes: 199dc6ed5179 ("md/raid0: update queue parameter in a safer location.")
Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agomd/raid0: update queue parameter in a safer location.
NeilBrown [Mon, 3 Aug 2015 03:11:47 +0000 (13:11 +1000)]
md/raid0: update queue parameter in a safer location.

commit 199dc6ed5179251fa6158a461499c24bdd99c836 upstream.

When a (e.g.) RAID5 array is reshaped to RAID0, the updating
of queue parameters (e.g. max number of sectors per bio) is
done in the wrong place.
It should be part of ->run, but it is actually part of ->takeover.
This means it happens before level_store() calls:

blk_set_stacking_limits(&mddev->queue->limits);

and so it ineffective.  This can lead to errors from underlying
devices.

So move all the relevant settings out of create_stripe_zones()
and into raid0_run().

As this can lead to a bug-on it is suitable for any -stable
kernel which supports reshape to RAID0.  So 2.6.35 or later.
As the bug has been present for five years there is no urgency,
so no need to rush into -stable.

Fixes: 9af204cf720c ("md: Add support for Raid5->Raid0 and Raid10->Raid0 takeover")
Reported-by: Yi Zhang <yizhan@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoUSB: option: add ZTE PIDs
Liu.Zhao [Mon, 24 Aug 2015 15:36:12 +0000 (08:36 -0700)]
USB: option: add ZTE PIDs

commit 19ab6bc5674a30fdb6a2436b068d19a3c17dc73e upstream.

This is intended to add ZTE device PIDs on kernel.

Signed-off-by: Liu.Zhao <lzsos369@163.com>
[johan: sort the new entries ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agostaging: ion: fix corruption of ion_import_dma_buf
Shawn Lin [Wed, 9 Sep 2015 07:41:52 +0000 (15:41 +0800)]
staging: ion: fix corruption of ion_import_dma_buf

commit 6fa92e2bcf6390e64895b12761e851c452d87bd8 upstream.

we found this issue but still exit in lastest kernel. Simply
keep ion_handle_create under mutex_lock to avoid this race.

WARNING: CPU: 2 PID: 2648 at drivers/staging/android/ion/ion.c:512 ion_handle_add+0xb4/0xc0()
ion_handle_add: buffer already found.
Modules linked in: iwlmvm iwlwifi mac80211 cfg80211 compat
CPU: 2 PID: 2648 Comm: TimedEventQueue Tainted: G        W    3.14.0 #7
 00000000 00000000 9a3efd2c 80faf273 9a3efd6c 9a3efd5c 80935dc9 811d7fd3
 9a3efd88 00000a58 812208a0 00000200 80e128d4 80e128d4 8d4ae00c a8cd8600
 a8cd8094 9a3efd74 80935e0e 00000009 9a3efd6c 811d7fd3 9a3efd88 9a3efd9c
Call Trace:
  [<80faf273>] dump_stack+0x48/0x69
  [<80935dc9>] warn_slowpath_common+0x79/0x90
  [<80e128d4>] ? ion_handle_add+0xb4/0xc0
  [<80e128d4>] ? ion_handle_add+0xb4/0xc0
  [<80935e0e>] warn_slowpath_fmt+0x2e/0x30
  [<80e128d4>] ion_handle_add+0xb4/0xc0
  [<80e144cc>] ion_import_dma_buf+0x8c/0x110
  [<80c517c4>] reg_init+0x364/0x7d0
  [<80993363>] ? futex_wait+0x123/0x210
  [<80992e0e>] ? get_futex_key+0x16e/0x1e0
  [<8099308f>] ? futex_wake+0x5f/0x120
  [<80c51e19>] vpu_service_ioctl+0x1e9/0x500
  [<80994aec>] ? do_futex+0xec/0x8e0
  [<80971080>] ? prepare_to_wait_event+0xc0/0xc0
  [<80c51c30>] ? reg_init+0x7d0/0x7d0
  [<80a22562>] do_vfs_ioctl+0x2d2/0x4c0
  [<80b198ad>] ? inode_has_perm.isra.41+0x2d/0x40
  [<80b199cf>] ? file_has_perm+0x7f/0x90
  [<80b1a5f7>] ? selinux_file_ioctl+0x47/0xf0
  [<80a227a8>] SyS_ioctl+0x58/0x80
  [<80fb45e8>] syscall_call+0x7/0x7
  [<80fb0000>] ? mmc_do_calc_max_discard+0xab/0xe4

Fixes: 83271f626 ("ion: hold reference to handle...")
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Reviewed-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agosvcrdma: Fix send_reply() scatter/gather set-up
Chuck Lever [Thu, 9 Jul 2015 20:45:18 +0000 (16:45 -0400)]
svcrdma: Fix send_reply() scatter/gather set-up

commit 9d11b51ce7c150a69e761e30518f294fc73d55ff upstream.

The Linux NFS server returns garbage in the data payload of inline
NFS/RDMA READ replies. These are READs of under 1000 bytes or so
where the client has not provided either a reply chunk or a write
list.

The NFS server delivers the data payload for an NFS READ reply to
the transport in an xdr_buf page list. If the NFS client did not
provide a reply chunk or a write list, send_reply() is supposed to
set up a separate sge for the page containing the READ data, and
another sge for XDR padding if needed, then post all of the sges via
a single SEND Work Request.

The problem is send_reply() does not advance through the xdr_buf
when setting up scatter/gather entries for SEND WR. It always calls
dma_map_xdr with xdr_off set to zero. When there's more than one
sge, dma_map_xdr() sets up the SEND sge's so they all point to the
xdr_buf's head.

The current Linux NFS/RDMA client always provides a reply chunk or
a write list when performing an NFS READ over RDMA. Therefore, it
does not exercise this particular case. The Linux server has never
had to use more than one extra sge for building RPC/RDMA replies
with a Linux client.

However, an NFS/RDMA client _is_ allowed to send small NFS READs
without setting up a write list or reply chunk. The NFS READ reply
fits entirely within the inline reply buffer in this case. This is
perhaps a more efficient way of performing NFS READs that the Linux
NFS/RDMA client may some day adopt.

Fixes: b432e6b3d9c1 ('svcrdma: Change DMA mapping logic to . . .')
BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=285
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoath10k: fix dma_mapping_error() handling
Michal Kazior [Wed, 19 Aug 2015 11:10:43 +0000 (13:10 +0200)]
ath10k: fix dma_mapping_error() handling

commit 5e55e3cbd1042cffa6249f22c10585e63f8a29bf upstream.

The function returns 1 when DMA mapping fails. The
driver would return bogus values and could
possibly confuse itself if DMA failed.

Fixes: 767d34fc67af ("ath10k: remove DMA mapping wrappers")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodm crypt: constrain crypt device's max_segment_size to PAGE_SIZE
Mike Snitzer [Thu, 10 Sep 2015 01:34:51 +0000 (21:34 -0400)]
dm crypt: constrain crypt device's max_segment_size to PAGE_SIZE

commit 586b286b110e94eb31840ac5afc0c24e0881fe34 upstream.

Setting the dm-crypt device's max_segment_size to PAGE_SIZE is an
unfortunate constraint that is required to avoid the potential for
exceeding dm-crypt's underlying device's max_segments limits -- due to
crypt_alloc_buffer() possibly allocating pages for the encryption bio
that are not as physically contiguous as the original bio.

It is interesting to note that this problem was already fixed back in
2007 via commit 91e106259 ("dm crypt: use bio_add_page").  But Linux 4.0
commit cf2f1abfb ("dm crypt: don't allocate pages for a partial
request") regressed dm-crypt back to _not_ using bio_add_page().  But
given dm-crypt's cpu parallelization changes all depend on commit
cf2f1abfb's abandoning of the more complex io fragments processing that
dm-crypt previously had we cannot easily go back to using
bio_add_page().

So all said the cleanest way to resolve this issue is to fix dm-crypt to
properly constrain the original bios entering dm-crypt so the encryption
bios that dm-crypt generates from the original bios are always
compatible with the underlying device's max_segments queue limits.

It should be noted that technically Linux 4.3 does _not_ need this fix
because of the block core's new late bio-splitting capability.  But, it
is reasoned, there is little to be gained by having the block core split
the encrypted bio that is composed of PAGE_SIZE segments.  That said, in
the future we may revert this change.

Fixes: cf2f1abfb ("dm crypt: don't allocate pages for a partial request")
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=104421
Suggested-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodm thin: disable discard support for thin devices if pool's is disabled
Mike Snitzer [Tue, 8 Sep 2015 12:56:13 +0000 (08:56 -0400)]
dm thin: disable discard support for thin devices if pool's is disabled

commit 216076705d6ac291d42e0f8dd85e6a0da98c0fa3 upstream.

If the pool is configured with 'ignore_discard' its discard support is
disabled.  The pool's thin devices should also have queue_limits that
reflect discards are disabled.

Fixes: 34fbcf62 ("dm thin: range discard support")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoPCI: Clear IORESOURCE_UNSET when clipping a bridge window
Bjorn Helgaas [Tue, 22 Sep 2015 22:03:54 +0000 (17:03 -0500)]
PCI: Clear IORESOURCE_UNSET when clipping a bridge window

commit b838b39e930aa1cfd099ea82ac40ed6d6413af26 upstream.

c770cb4cb505 ("PCI: Mark invalid BARs as unassigned") sets IORESOURCE_UNSET
if we fail to claim a resource.  If we tried to claim a bridge window,
failed, clipped the window, and tried to claim the clipped window, we
failed again because of IORESOURCE_UNSET:

  pci_bus 0000:00: root bus resource [mem 0xc0000000-0xffffffff window]
  pci 0000:00:01.0: can't claim BAR 15 [mem 0xbdf00000-0xddefffff 64bit pref]: no compatible bridge window
  pci 0000:00:01.0: [mem size 0x20000000 64bit pref] clipped to [mem size 0x1df00000 64bit pref]
  pci 0000:00:01.0:   bridge window [mem size 0x1df00000 64bit pref]
  pci 0000:00:01.0: can't claim BAR 15 [mem size 0x1df00000 64bit pref]: no address assigned

The 00:01.0 window started as [mem 0xbdf00000-0xddefffff 64bit pref].  That
starts before the host bridge window [mem 0xc0000000-0xffffffff window], so
we clipped the 00:01.0 window to [mem 0xc0000000-0xddefffff 64bit pref].
But we left it marked IORESOURCE_UNSET, so the second claim failed when it
should have succeeded.

This means downstream devices will also fail for lack of resources, e.g.,
in the bugzilla below,

  radeon 0000:01:00.0: Fatal error during GPU init

Clear IORESOURCE_UNSET when we clip a bridge window.  Also clear
IORESOURCE_UNSET in our copy of the unclipped window so we can see exactly
what the original window was and how it now fits inside the upstream
window.

Fixes: c770cb4cb505 ("PCI: Mark invalid BARs as unassigned")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85491#c47
Based-on-patch-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Based-on-patch-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoPCI: Use function 0 VPD for identical functions, regular VPD for others
Alex Williamson [Wed, 16 Sep 2015 04:24:46 +0000 (22:24 -0600)]
PCI: Use function 0 VPD for identical functions, regular VPD for others

commit da2d03ea27f6ed9d2005a67b20dd021ddacf1e4d upstream.

932c435caba8 ("PCI: Add dev_flags bit to access VPD through function 0")
added PCI_DEV_FLAGS_VPD_REF_F0.  Previously, we set the flag on every
non-zero function of quirked devices.  If a function turned out to be
different from function 0, i.e., it had a different class, vendor ID, or
device ID, the flag remained set but we didn't make VPD accessible at all.

Flip this around so we only set PCI_DEV_FLAGS_VPD_REF_F0 for functions that
are identical to function 0, and allow regular VPD access for any other
functions.

[bhelgaas: changelog, stable tag]
Fixes: 932c435caba8 ("PCI: Add dev_flags bit to access VPD through function 0")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Acked-by: Myron Stowe <myron.stowe@redhat.com>
Acked-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoPCI: Fix devfn for VPD access through function 0
Alex Williamson [Tue, 15 Sep 2015 17:17:21 +0000 (11:17 -0600)]
PCI: Fix devfn for VPD access through function 0

commit 9d9240756e63dd87d6cbf5da8b98ceb8f8192b55 upstream.

Commit 932c435caba8 ("PCI: Add dev_flags bit to access VPD through function
0") passes PCI_SLOT(devfn) for the devfn parameter of pci_get_slot().
Generally this works because we're fairly well guaranteed that a PCIe
device is at slot address 0, but for the general case, including
conventional PCI, it's incorrect.  We need to get the slot and then convert
it back into a devfn.

Fixes: 932c435caba8 ("PCI: Add dev_flags bit to access VPD through function 0")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Acked-by: Myron Stowe <myron.stowe@redhat.com>
Acked-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agotools/lguest: Fix redefinition of struct virtio_pci_cfg_cap
Rusty Russell [Wed, 26 Aug 2015 01:12:26 +0000 (10:42 +0930)]
tools/lguest: Fix redefinition of struct virtio_pci_cfg_cap

commit e523caa601f4a7c2fa1ecd040db921baf7453798 upstream.

Ours uses a u32 for the data, since we ensure it's always
aligned and it's x86 so it doesn't matter anyway.

  lguest.c:128:8: error: redefinition of ‘struct virtio_pci_cfg_cap’

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Fixes: 3121bb023e2db ("virtio: define virtio_pci_cfg_cap in header.")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoBtrfs: update fix for read corruption of compressed and shared extents
Filipe Manana [Mon, 28 Sep 2015 08:56:26 +0000 (09:56 +0100)]
Btrfs: update fix for read corruption of compressed and shared extents

commit 808f80b46790f27e145c72112189d6a3be2bc884 upstream.

My previous fix in commit 005efedf2c7d ("Btrfs: fix read corruption of
compressed and shared extents") was effective only if the compressed
extents cover a file range with a length that is not a multiple of 16
pages. That's because the detection of when we reached a different range
of the file that shares the same compressed extent as the previously
processed range was done at extent_io.c:__do_contiguous_readpages(),
which covers subranges with a length up to 16 pages, because
extent_readpages() groups the pages in clusters no larger than 16 pages.
So fix this by tracking the start of the previously processed file
range's extent map at extent_readpages().

The following test case for fstests reproduces the issue:

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"
  tmp=/tmp/$$
  status=1 # failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
      rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter

  # real QA test starts here
  _need_to_be_root
  _supported_fs btrfs
  _supported_os Linux
  _require_scratch
  _require_cloner

  rm -f $seqres.full

  test_clone_and_read_compressed_extent()
  {
      local mount_opts=$1

      _scratch_mkfs >>$seqres.full 2>&1
      _scratch_mount $mount_opts

      # Create our test file with a single extent of 64Kb that is going to
      # be compressed no matter which compression algo is used (zlib/lzo).
      $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 64K" \
          $SCRATCH_MNT/foo | _filter_xfs_io

      # Now clone the compressed extent into an adjacent file offset.
      $CLONER_PROG -s 0 -d $((64 * 1024)) -l $((64 * 1024)) \
          $SCRATCH_MNT/foo $SCRATCH_MNT/foo

      echo "File digest before unmount:"
      md5sum $SCRATCH_MNT/foo | _filter_scratch

      # Remount the fs or clear the page cache to trigger the bug in
      # btrfs. Because the extent has an uncompressed length that is a
      # multiple of 16 pages, all the pages belonging to the second range
      # of the file (64K to 128K), which points to the same extent as the
      # first range (0K to 64K), had their contents full of zeroes instead
      # of the byte 0xaa. This was a bug exclusively in the read path of
      # compressed extents, the correct data was stored on disk, btrfs
      # just failed to fill in the pages correctly.
      _scratch_remount

      echo "File digest after remount:"
      # Must match the digest we got before.
      md5sum $SCRATCH_MNT/foo | _filter_scratch
  }

  echo -e "\nTesting with zlib compression..."
  test_clone_and_read_compressed_extent "-o compress=zlib"

  _scratch_unmount

  echo -e "\nTesting with lzo compression..."
  test_clone_and_read_compressed_extent "-o compress=lzo"

  status=0
  exit

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Tested-by: Timofey Titovets <nefelim4ag@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoBtrfs: fix read corruption of compressed and shared extents
Filipe Manana [Mon, 14 Sep 2015 08:09:31 +0000 (09:09 +0100)]
Btrfs: fix read corruption of compressed and shared extents

commit 005efedf2c7d0a270ffbe28d8997b03844f3e3e7 upstream.

If a file has a range pointing to a compressed extent, followed by
another range that points to the same compressed extent and a read
operation attempts to read both ranges (either completely or part of
them), the pages that correspond to the second range are incorrectly
filled with zeroes.

Consider the following example:

  File layout
  [0 - 8K]                      [8K - 24K]
      |                             |
      |                             |
   points to extent X,         points to extent X,
   offset 4K, length of 8K     offset 0, length 16K

  [extent X, compressed length = 4K uncompressed length = 16K]

If a readpages() call spans the 2 ranges, a single bio to read the extent
is submitted - extent_io.c:submit_extent_page() would only create a new
bio to cover the second range pointing to the extent if the extent it
points to had a different logical address than the extent associated with
the first range. This has a consequence of the compressed read end io
handler (compression.c:end_compressed_bio_read()) finish once the extent
is decompressed into the pages covering the first range, leaving the
remaining pages (belonging to the second range) filled with zeroes (done
by compression.c:btrfs_clear_biovec_end()).

So fix this by submitting the current bio whenever we find a range
pointing to a compressed extent that was preceded by a range with a
different extent map. This is the simplest solution for this corner
case. Making the end io callback populate both ranges (or more, if we
have multiple pointing to the same extent) is a much more complex
solution since each bio is tightly coupled with a single extent map and
the extent maps associated to the ranges pointing to the shared extent
can have different offsets and lengths.

The following test case for fstests triggers the issue:

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"
  tmp=/tmp/$$
  status=1 # failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
      rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter

  # real QA test starts here
  _need_to_be_root
  _supported_fs btrfs
  _supported_os Linux
  _require_scratch
  _require_cloner

  rm -f $seqres.full

  test_clone_and_read_compressed_extent()
  {
      local mount_opts=$1

      _scratch_mkfs >>$seqres.full 2>&1
      _scratch_mount $mount_opts

      # Create a test file with a single extent that is compressed (the
      # data we write into it is highly compressible no matter which
      # compression algorithm is used, zlib or lzo).
      $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 4K"        \
                      -c "pwrite -S 0xbb 4K 8K"        \
                      -c "pwrite -S 0xcc 12K 4K"       \
                      $SCRATCH_MNT/foo | _filter_xfs_io

      # Now clone our extent into an adjacent offset.
      $CLONER_PROG -s $((4 * 1024)) -d $((16 * 1024)) -l $((8 * 1024)) \
          $SCRATCH_MNT/foo $SCRATCH_MNT/foo

      # Same as before but for this file we clone the extent into a lower
      # file offset.
      $XFS_IO_PROG -f -c "pwrite -S 0xaa 8K 4K"         \
                      -c "pwrite -S 0xbb 12K 8K"        \
                      -c "pwrite -S 0xcc 20K 4K"        \
                      $SCRATCH_MNT/bar | _filter_xfs_io

      $CLONER_PROG -s $((12 * 1024)) -d 0 -l $((8 * 1024)) \
          $SCRATCH_MNT/bar $SCRATCH_MNT/bar

      echo "File digests before unmounting filesystem:"
      md5sum $SCRATCH_MNT/foo | _filter_scratch
      md5sum $SCRATCH_MNT/bar | _filter_scratch

      # Evicting the inode or clearing the page cache before reading
      # again the file would also trigger the bug - reads were returning
      # all bytes in the range corresponding to the second reference to
      # the extent with a value of 0, but the correct data was persisted
      # (it was a bug exclusively in the read path). The issue happened
      # only if the same readpages() call targeted pages belonging to the
      # first and second ranges that point to the same compressed extent.
      _scratch_remount

      echo "File digests after mounting filesystem again:"
      # Must match the same digests we got before.
      md5sum $SCRATCH_MNT/foo | _filter_scratch
      md5sum $SCRATCH_MNT/bar | _filter_scratch
  }

  echo -e "\nTesting with zlib compression..."
  test_clone_and_read_compressed_extent "-o compress=zlib"

  _scratch_unmount

  echo -e "\nTesting with lzo compression..."
  test_clone_and_read_compressed_extent "-o compress=lzo"

  status=0
  exit

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo<quwenruo@cn.fujitsu.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agobtrfs: skip waiting on ordered range for special files
Jeff Mahoney [Sat, 12 Sep 2015 01:44:17 +0000 (21:44 -0400)]
btrfs: skip waiting on ordered range for special files

commit a30e577c96f59b1e1678ea5462432b09bf7d5cbc upstream.

In btrfs_evict_inode, we properly truncate the page cache for evicted
inodes but then we call btrfs_wait_ordered_range for every inode as well.
It's the right thing to do for regular files but results in incorrect
behavior for device inodes for block devices.

filemap_fdatawrite_range gets called with inode->i_mapping which gets
resolved to the block device inode before getting passed to
wbc_attach_fdatawrite_inode and ultimately to inode_to_bdi.  What happens
next depends on whether there's an open file handle associated with the
inode.  If there is, we write to the block device, which is unexpected
behavior.  If there isn't, we through normally and inode->i_data is used.
We can also end up racing against open/close which can result in crashes
when i_mapping points to a block device inode that has been closed.

Since there can't be any page cache associated with special file inodes,
it's safe to skip the btrfs_wait_ordered_range call entirely and avoid
the problem.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=100911
Tested-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoASoC: tas2552: fix dBscale-min declaration
Andreas Dannenberg [Mon, 5 Oct 2015 20:00:14 +0000 (15:00 -0500)]
ASoC: tas2552: fix dBscale-min declaration

commit e2600460bc3aa14ca1df86318a327cbbabedf9a8 upstream.

The minimum volume level for the TAS2552 (control register value 0x00)
is -7dB however the driver declares it as -0.07dB.

Running amixer before the patch reports:
dBscale-min=-0.07dB,step=1.00dB,mute=0

Running amixer with the patch applied reports:
dBscale-min=-7.00dB,step=1.00dB,mute=0

Signed-off-by: Andreas Dannenberg <dannenberg@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoASoC: sgtl5000: fix wrong register MIC_BIAS_VOLTAGE setup on probe
Gianluca Renzi [Fri, 25 Sep 2015 19:33:41 +0000 (21:33 +0200)]
ASoC: sgtl5000: fix wrong register MIC_BIAS_VOLTAGE setup on probe

commit e256da84a04ea31c3c215997c847609af224e8f4 upstream.

Signed-off-by: Gianluca Renzi <gianlucarenzi@eurekelettronica.it>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoASoC: db1200: Fix DAI link format for db1300 and db1550
Lars-Peter Clausen [Fri, 25 Sep 2015 09:07:04 +0000 (11:07 +0200)]
ASoC: db1200: Fix DAI link format for db1300 and db1550

commit e74679b38c9417c1c524081121cdcdb36f82264d upstream.

Commit b4508d0f95fa ("ASoC: db1200: Use static DAI format setup") switched
the db1200 driver over to using static DAI format setup instead of a
callback function. But the commit only added the dai_fmt field to one of
the three DAI links in the driver. This breaks audio on db1300 and db1550.

Add the two missing dai_fmt settings to fix the issue.

Fixes: b4508d0f95fa ("ASoC: db1200: Use static DAI format setup")
Reported-by: Manuel Lauss <manuel.lauss@gmail.com>
Tested-by: Manuel Lauss <manuel.lauss@gmail.com>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoASoC: dwc: correct irq clear method
Yitian Bu [Fri, 2 Oct 2015 07:18:41 +0000 (15:18 +0800)]
ASoC: dwc: correct irq clear method

commit 4873867e5f2bd90faad861dd94865099fc3140f3 upstream.

from Designware I2S datasheet, tx/rx XRUN irq is cleared by
reading register TOR/ROR, rather than by writing into them.

Signed-off-by: Yitian Bu <yitian.bu@tangramtek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoASoC: fix broken pxa SoC support
Robert Jarzmik [Tue, 15 Sep 2015 18:51:31 +0000 (20:51 +0200)]
ASoC: fix broken pxa SoC support

commit 3c8f7710c1c44fb650bc29b6ef78ed8b60cfaa28 upstream.

The previous fix of pxa library support, which was introduced to fix the
library dependency, broke the previous SoC behavior, where a machine
code binding pxa2xx-ac97 with a coded relied on :
 - sound/soc/pxa/pxa2xx-ac97.c
 - sound/soc/codecs/XXX.c

For example, the mioa701_wm9713.c machine code is currently broken. The
"select ARM" statement wrongly selects the soc/arm/pxa2xx-ac97 for
compilation, as per an unfortunate fate SND_PXA2XX_AC97 is both declared
in sound/arm/Kconfig and sound/soc/pxa/Kconfig.

Fix this by ensuring that SND_PXA2XX_SOC correctly triggers the correct
pxa2xx-ac97 compilation.

Fixes: 846172dfe33c ("ASoC: fix SND_PXA2XX_LIB Kconfig warning")
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoASoC: pxa: pxa2xx-ac97: fix dma requestor lines
Robert Jarzmik [Tue, 22 Sep 2015 19:20:22 +0000 (21:20 +0200)]
ASoC: pxa: pxa2xx-ac97: fix dma requestor lines

commit 8811191fdf7ed02ee07cb8469428158572d355a2 upstream.

PCM receive and transmit DMA requestor lines were reverted, breaking the
PCM playback interface for PXA platforms using the sound/soc/ variant
instead of the sound/arm variant.

The commit below shows the inversion in the requestor lines.

Fixes: d65a14587a9b ("ASoC: pxa: use snd_dmaengine_dai_dma_data")
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: hda - Disable power_save_node for IDT 92HD73xx chips
Takashi Iwai [Sun, 4 Oct 2015 20:44:12 +0000 (22:44 +0200)]
ALSA: hda - Disable power_save_node for IDT 92HD73xx chips

commit c7e1008048a97148d3aecae742f66fb2f944644c upstream.

The recent widget power saving introduced some unavoidable click
noises on old IDT 92HD73xx chips while it still seems working on the
compatible new chips.  In the bugzilla, we tried lots of tests and
workarounds, but they didn't help much.  So, let's disable the feature
for these specific chips as the least (but safest) fix.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=104981
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: hda - Apply SPDIF pin ctl to MacBookPro 12,1
John Flatness [Fri, 2 Oct 2015 21:07:49 +0000 (17:07 -0400)]
ALSA: hda - Apply SPDIF pin ctl to MacBookPro 12,1

commit e8ff581f7ac2bc3b8886094b7ca635dcc4d1b0e9 upstream.

The MacBookPro 12,1 has the same setup as the 11 for controlling the
status of the optical audio light. Simply apply the existing workaround
to the subsystem ID for the 12,1.

[sorted the fixup entry by tiwai]

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=105401
Signed-off-by: John Flatness <john@zerocrates.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: hda: Add dock support for ThinkPad T550
Laura Abbott [Fri, 2 Oct 2015 18:09:54 +0000 (11:09 -0700)]
ALSA: hda: Add dock support for ThinkPad T550

commit d05ea7da0e8f6df3c62cfee75538f347cb3d89ef upstream.

Much like all the other Lenovo laptops, add a quirk to make
sound work with docking.

Reported-and-tested-by: lacknerflo@gmail.com
Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: synth: Fix conflicting OSS device registration on AWE32
Takashi Iwai [Mon, 5 Oct 2015 14:55:09 +0000 (16:55 +0200)]
ALSA: synth: Fix conflicting OSS device registration on AWE32

commit 225db5762dc1a35b26850477ffa06e5cd0097243 upstream.

When OSS emulation is loaded on ISA SB AWE32 chip, we get now kernel
warnings like:
  WARNING: CPU: 0 PID: 2791 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x51/0x80()
  sysfs: cannot create duplicate filename '/devices/isa/sbawe.0/sound/card0/seq-oss-0-0'

It's because both emux synth and opl3 drivers try to register their
OSS device object with the same static index number 0.  This hasn't
been a big problem until the recent rewrite of device management code
(that exposes sysfs at the same time), but it's been an obvious bug.

This patch works around it just by using a different index number of
emux synth object.  There can be a more elegant way to fix, but it's
enough for now, as this code won't be touched so often, in anyway.

Reported-and-tested-by: Michael Shell <list1@michaelshell.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: hda - Disable power_save_node for Thinkpads
Takashi Iwai [Thu, 24 Sep 2015 15:36:51 +0000 (17:36 +0200)]
ALSA: hda - Disable power_save_node for Thinkpads

commit 7f57d803ee03730d570dc59a9e3e4842b58dd5cc upstream.

Lenovo Thinkpads with recent Realtek codecs seem suffering from click
noises at power transition since the introduction of widget power
saving in 4.1 kernel.  Although this might be solved by some delays in
appropriate points, as a quick workaround, just disable the
power_save_node feature for now.  The gain it gives is relatively
small, and this makes the situation back to pre 4.1 time.

This patch ended up with a bit more code changes than usual because
the existing fixup for Thinkpads is highly chained.  Instead of adding
yet another chain, combine a few of them into a single fixup entry, as
a gratis cleanup.

Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=943982
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: hda/tegra - async probe for avoiding module loading deadlock
Takashi Iwai [Thu, 24 Sep 2015 09:00:18 +0000 (11:00 +0200)]
ALSA: hda/tegra - async probe for avoiding module loading deadlock

commit 83510441bc08bee201c0ded9d81da6dfd008d69a upstream.

The Tegra HD-audio controller driver causes deadlocks when loaded as a
module since the driver invokes request_module() at binding with the
codec driver.  This patch works around it by deferring the probe in a
work like Intel HD-audio controller driver does.  Although hovering
the codec probe stuff into udev would be a better solution, it may
cause other regressions, so let's try this band-aid fix until the more
proper solution gets landed.

Reported-by: Thierry Reding <treding@nvidia.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agomemcg: fix dirty page migration
Greg Thelen [Thu, 1 Oct 2015 22:37:02 +0000 (15:37 -0700)]
memcg: fix dirty page migration

commit 0610c25daa3e76e38ad5a8fae683a89ff9f71798 upstream.

The problem starts with a file backed dirty page which is charged to a
memcg.  Then page migration is used to move oldpage to newpage.

Migration:
 - copies the oldpage's data to newpage
 - clears oldpage.PG_dirty
 - sets newpage.PG_dirty
 - uncharges oldpage from memcg
 - charges newpage to memcg

Clearing oldpage.PG_dirty decrements the charged memcg's dirty page
count.

However, because newpage is not yet charged, setting newpage.PG_dirty
does not increment the memcg's dirty page count.  After migration
completes newpage.PG_dirty is eventually cleared, often in
account_page_cleaned().  At this time newpage is charged to a memcg so
the memcg's dirty page count is decremented which causes underflow
because the count was not previously incremented by migration.  This
underflow causes balance_dirty_pages() to see a very large unsigned
number of dirty memcg pages which leads to aggressive throttling of
buffered writes by processes in non root memcg.

This issue:
 - can harm performance of non root memcg buffered writes.
 - can report too small (even negative) values in
   memory.stat[(total_)dirty] counters of all memcg, including the root.

To avoid polluting migrate.c with #ifdef CONFIG_MEMCG checks, introduce
page_memcg() and set_page_memcg() helpers.

Test:
    0) setup and enter limited memcg
    mkdir /sys/fs/cgroup/test
    echo 1G > /sys/fs/cgroup/test/memory.limit_in_bytes
    echo $$ > /sys/fs/cgroup/test/cgroup.procs

    1) buffered writes baseline
    dd if=/dev/zero of=/data/tmp/foo bs=1M count=1k
    sync
    grep ^dirty /sys/fs/cgroup/test/memory.stat

    2) buffered writes with compaction antagonist to induce migration
    yes 1 > /proc/sys/vm/compact_memory &
    rm -rf /data/tmp/foo
    dd if=/dev/zero of=/data/tmp/foo bs=1M count=1k
    kill %
    sync
    grep ^dirty /sys/fs/cgroup/test/memory.stat

    3) buffered writes without antagonist, should match baseline
    rm -rf /data/tmp/foo
    dd if=/dev/zero of=/data/tmp/foo bs=1M count=1k
    sync
    grep ^dirty /sys/fs/cgroup/test/memory.stat

                       (speed, dirty residue)
             unpatched                       patched
    1) 841 MB/s 0 dirty pages          886 MB/s 0 dirty pages
    2) 611 MB/s -33427456 dirty pages  793 MB/s 0 dirty pages
    3) 114 MB/s -33427456 dirty pages  891 MB/s 0 dirty pages

    Notice that unpatched baseline performance (1) fell after
    migration (3): 841 -> 114 MB/s.  In the patched kernel, post
    migration performance matches baseline.

Fixes: c4843a7593a9 ("memcg: add per cgroup dirty page accounting")
Signed-off-by: Greg Thelen <gthelen@google.com>
Reported-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agomm: hugetlbfs: skip shared VMAs when unmapping private pages to satisfy a fault
Mel Gorman [Thu, 1 Oct 2015 22:36:57 +0000 (15:36 -0700)]
mm: hugetlbfs: skip shared VMAs when unmapping private pages to satisfy a fault

commit 2f84a8990ebbe235c59716896e017c6b2ca1200f upstream.

SunDong reported the following on

  https://bugzilla.kernel.org/show_bug.cgi?id=103841

I think I find a linux bug, I have the test cases is constructed. I
can stable recurring problems in fedora22(4.0.4) kernel version,
arch for x86_64.  I construct transparent huge page, when the parent
and child process with MAP_SHARE, MAP_PRIVATE way to access the same
huge page area, it has the opportunity to lead to huge page copy on
write failure, and then it will munmap the child corresponding mmap
area, but then the child mmap area with VM_MAYSHARE attributes, child
process munmap this area can trigger VM_BUG_ON in set_vma_resv_flags
functions (vma - > vm_flags & VM_MAYSHARE).

There were a number of problems with the report (e.g.  it's hugetlbfs that
triggers this, not transparent huge pages) but it was fundamentally
correct in that a VM_BUG_ON in set_vma_resv_flags() can be triggered that
looks like this

 vma ffff8804651fd0d0 start 00007fc474e00000 end 00007fc475e00000
 next ffff8804651fd018 prev ffff8804651fd188 mm ffff88046b1b1800
 prot 8000000000000027 anon_vma           (null) vm_ops ffffffff8182a7a0
 pgoff 0 file ffff88106bdb9800 private_data           (null)
 flags: 0x84400fb(read|write|shared|mayread|maywrite|mayexec|mayshare|dontexpand|hugetlb)
 ------------
 kernel BUG at mm/hugetlb.c:462!
 SMP
 Modules linked in: xt_pkttype xt_LOG xt_limit [..]
 CPU: 38 PID: 26839 Comm: map Not tainted 4.0.4-default #1
 Hardware name: Dell Inc. PowerEdge R810/0TT6JF, BIOS 2.7.4 04/26/2012
 set_vma_resv_flags+0x2d/0x30

The VM_BUG_ON is correct because private and shared mappings have
different reservation accounting but the warning clearly shows that the
VMA is shared.

When a private COW fails to allocate a new page then only the process
that created the VMA gets the page -- all the children unmap the page.
If the children access that data in the future then they get killed.

The problem is that the same file is mapped shared and private.  During
the COW, the allocation fails, the VMAs are traversed to unmap the other
private pages but a shared VMA is found and the bug is triggered.  This
patch identifies such VMAs and skips them.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reported-by: SunDong <sund_sky@126.com>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: David Rientjes <rientjes@google.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoocfs2/dlm: fix deadlock when dispatch assert master
Joseph Qi [Tue, 22 Sep 2015 21:59:20 +0000 (14:59 -0700)]
ocfs2/dlm: fix deadlock when dispatch assert master

commit 012572d4fc2e4ddd5c8ec8614d51414ec6cae02a upstream.

The order of the following three spinlocks should be:
dlm_domain_lock < dlm_ctxt->spinlock < dlm_lock_resource->spinlock

But dlm_dispatch_assert_master() is called while holding
dlm_ctxt->spinlock and dlm_lock_resource->spinlock, and then it calls
dlm_grab() which will take dlm_domain_lock.

Once another thread (for example, dlm_query_join_handler) has already
taken dlm_domain_lock, and tries to take dlm_ctxt->spinlock deadlock
happens.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: "Junxiao Bi" <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agolib/iommu-common.c: do not try to deref a null iommu->lazy_flush() pointer when n...
Sowmini Varadhan [Tue, 22 Sep 2015 21:59:20 +0000 (14:59 -0700)]
lib/iommu-common.c: do not try to deref a null iommu->lazy_flush() pointer when n < pool->hint

commit d046b770c9fc36ccb19c27afdb8322220108cbc7 upstream.

The check for invoking iommu->lazy_flush() from iommu_tbl_range_alloc()
has to be refactored so that we only call ->lazy_flush() if it is
non-null.

I had a sparc kernel that was crashing when I was trying to process some
very large perf.data files- the crash happens when the scsi driver calls
into dma_4v_map_sg and thus the iommu_tbl_range_alloc().

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agomm: migrate: hugetlb: putback destination hugepage to active list
Naoya Horiguchi [Tue, 22 Sep 2015 21:59:14 +0000 (14:59 -0700)]
mm: migrate: hugetlb: putback destination hugepage to active list

commit 3aaa76e125c1dd58c9b599baa8c6021896874c12 upstream.

Since commit bcc54222309c ("mm: hugetlb: introduce page_huge_active")
each hugetlb page maintains its active flag to avoid a race condition
betwe= en multiple calls of isolate_huge_page(), but current kernel
doesn't set the f= lag on a hugepage allocated by migration because the
proper putback routine isn= 't called.  This means that users could
still encounter the race referred to by bcc54222309c in this special
case, so this patch fixes it.

Fixes: bcc54222309c ("mm: hugetlb: introduce page_huge_active")
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agospi: spidev: fix possible NULL dereference
Sudip Mukherjee [Thu, 10 Sep 2015 11:18:13 +0000 (16:48 +0530)]
spi: spidev: fix possible NULL dereference

commit dd85ebf681ef0ee1fc985c353dd45e8b53b5dc1e upstream.

During the last close we are freeing spidev if spidev->spi is NULL, but
just before checking if spidev->spi is NULL we are dereferencing it.
Lets add a check there to avoid the NULL dereference.

Fixes: 9169051617df ("spi: spidev: Don't mangle max_speed_hz in underlying spi device")
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Reviewed-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agospi: spi-pxa2xx: Check status register to determine if SSSR_TINT is disabled
Tan, Jui Nee [Tue, 1 Sep 2015 02:22:51 +0000 (10:22 +0800)]
spi: spi-pxa2xx: Check status register to determine if SSSR_TINT is disabled

commit 02bc933ebb59208f42c2e6305b2c17fd306f695d upstream.

On Intel Baytrail, there is case when interrupt handler get called, no SPI
message is captured. The RX FIFO is indeed empty when RX timeout pending
interrupt (SSSR_TINT) happens.

Use the BIOS version where both HSUART and SPI are on the same IRQ. Both
drivers are using IRQF_SHARED when calling the request_irq function. When
running two separate and independent SPI and HSUART application that
generate data traffic on both components, user will see messages like
below on the console:

  pxa2xx-spi pxa2xx-spi.0: bad message state in interrupt handler

This commit will fix this by first checking Receiver Time-out Interrupt,
if it is disabled, ignore the request and return without servicing.

Signed-off-by: Tan, Jui Nee <jui.nee.tan@intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agospi: bcm2835: BUG: fix wrong use of PAGE_MASK
Martin Sperl [Thu, 10 Sep 2015 09:32:14 +0000 (09:32 +0000)]
spi: bcm2835: BUG: fix wrong use of PAGE_MASK

commit 2a3fffd45822070309bcf0b1e1dae624d633824a upstream.

There is a bug in the alignment checking of transfers,
that results in DMA not being used for un-aligned
transfers that do not cross page-boundries, which is valid.

This is due to a missconception of the meaning PAGE_MASK
when implementing that check originally - (PAGE_SIZE - 1)
should have been used instead.

Also fixes a copy/paste error.

Reported-by: <robert@axium.co.nz>
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agospi: xtensa-xtfpga: fix register endianness
Max Filippov [Tue, 22 Sep 2015 11:32:03 +0000 (14:32 +0300)]
spi: xtensa-xtfpga: fix register endianness

commit b0b4855099e301c8603ea37da9a0103a96c2e0b1 upstream.

XTFPGA SPI controller has native endian registers.
Fix register acessors so that they work in big-endian configurations.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agospi: Fix documentation of spi_alloc_master()
Guenter Roeck [Sat, 5 Sep 2015 22:46:54 +0000 (01:46 +0300)]
spi: Fix documentation of spi_alloc_master()

commit a394d635193b641f2c86ead5ada5b115d57c51f8 upstream.

Actually, spi_master_put() after spi_alloc_master() must _not_ be followed
by kfree(). The memory is already freed with the call to spi_master_put()
through spi_master_class, which registers a release function. Calling both
spi_master_put() and kfree() results in often nasty (and delayed) crashes
elsewhere in the kernel, often in the networking stack.

This reverts commit eb4af0f5349235df2e4a5057a72fc8962d00308a.

Link to patch and concerns: https://lkml.org/lkml/2012/9/3/269
or
http://lkml.iu.edu/hypermail/linux/kernel/1209.0/00790.html

Alexey Klimov: This revert becomes valid after
94c69f765f1b4a658d96905ec59928e3e3e07e6a when spi-imx.c
has been fixed and there is no need to call kfree() so comment
for spi_alloc_master() should be fixed.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agomemcg: make mem_cgroup_read_stat() unsigned
Greg Thelen [Thu, 1 Oct 2015 22:37:05 +0000 (15:37 -0700)]
memcg: make mem_cgroup_read_stat() unsigned

commit 484ebb3b8c8b27dd2171696462a3116edb9ff801 upstream.

mem_cgroup_read_stat() returns a page count by summing per cpu page
counters.  The summing is racy wrt.  updates, so a transient negative
sum is possible.  Callers don't want negative values:

 - mem_cgroup_wb_stats() doesn't want negative nr_dirty or nr_writeback.
   This could confuse dirty throttling.

 - oom reports and memory.stat shouldn't show confusing negative usage.

 - tree_usage() already avoids negatives.

Avoid returning negative page counts from mem_cgroup_read_stat() and
convert it to unsigned.

[akpm@linux-foundation.org: fix old typo while we're in there]
Signed-off-by: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoRevert "sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem"
Tejun Heo [Wed, 16 Sep 2015 15:51:12 +0000 (11:51 -0400)]
Revert "sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem"

commit 0c986253b939cc14c69d4adbe2b4121bdf4aa220 upstream.

This reverts commit d59cfc09c32a2ae31f1c3bc2983a0cd79afb3f14.

d59cfc09c32a ("sched, cgroup: replace signal_struct->group_rwsem with
a global percpu_rwsem") and b5ba75b5fc0e ("cgroup: simplify
threadgroup locking") changed how cgroup synchronizes against task
fork and exits so that it uses global percpu_rwsem instead of
per-process rwsem; unfortunately, the write [un]lock paths of
percpu_rwsem always involve synchronize_rcu_expedited() which turned
out to be too expensive.

Improvements for percpu_rwsem are scheduled to be merged in the coming
v4.4-rc1 merge window which alleviates this issue.  For now, revert
the two commits to restore per-process rwsem.  They will be re-applied
for the v4.4-rc1 merge window.

Signed-off-by: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/g/55F8097A.7000206@de.ibm.com
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoRevert "cgroup: simplify threadgroup locking"
Tejun Heo [Wed, 16 Sep 2015 15:51:12 +0000 (11:51 -0400)]
Revert "cgroup: simplify threadgroup locking"

commit f9f9e7b776142fb1c0782cade004cc8e0147a199 upstream.

This reverts commit b5ba75b5fc0e8404e2c50cb68f39bb6a53fc916f.

d59cfc09c32a ("sched, cgroup: replace signal_struct->group_rwsem with
a global percpu_rwsem") and b5ba75b5fc0e ("cgroup: simplify
threadgroup locking") changed how cgroup synchronizes against task
fork and exits so that it uses global percpu_rwsem instead of
per-process rwsem; unfortunately, the write [un]lock paths of
percpu_rwsem always involve synchronize_rcu_expedited() which turned
out to be too expensive.

Improvements for percpu_rwsem are scheduled to be merged in the coming
v4.4-rc1 merge window which alleviates this issue.  For now, revert
the two commits to restore per-process rwsem.  They will be re-applied
for the v4.4-rc1 merge window.

Signed-off-by: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/g/55F8097A.7000206@de.ibm.com
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agos390/boot/decompression: disable floating point in decompressor
Christian Borntraeger [Mon, 28 Sep 2015 20:47:42 +0000 (22:47 +0200)]
s390/boot/decompression: disable floating point in decompressor

commit adc0b7fbf6fe9967505c0254d9535ec7288186ae upstream.

my gcc 5.1 used an ldgr instruction with a register != 0,2,4,6 for
spilling/filling into a floating point register in our decompressor.

This will cause an AFP-register data exception as the decompressor
did not setup the additional floating point registers via cr0.
That causes a program check loop that looked like a hang with
one "Uncompressing Linux... " message (directly booted via kvm)
or a loop of "Uncompressing Linux... " messages (when booted via
zipl boot loader).

The offending code in my build was

   48e400:       e3 c0 af ff ff 71       lay     %r12,-1(%r10)
-->48e406:       b3 c1 00 1c             ldgr    %f1,%r12
   48e40a:       ec 6c 01 22 02 7f       clij    %r6,2,12,0x48e64e

but gcc could do spilling into an fpr at any function. We can
simply disable floating point support at that early stage.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agos390/compat: correct uc_sigmask of the compat signal frame
Martin Schwidefsky [Tue, 8 Sep 2015 13:25:39 +0000 (15:25 +0200)]
s390/compat: correct uc_sigmask of the compat signal frame

commit 8d4bd0ed0439dfc780aab801a085961925ed6838 upstream.

The uc_sigmask in the ucontext structure is an array of words to keep
the 64 signal bits (or 1024 if you ask glibc but the kernel sigset_t
only has 64 bits).

For 64 bit the sigset_t contains a single 8 byte word, but for 31 bit
there are two 4 byte words. The compat signal handler code uses a
simple copy of the 64 bit sigset_t to the 31 bit compat_sigset_t.
As s390 is a big-endian architecture this is incorrect, the two words
in the 31 bit sigset_t array need to be swapped.

Reported-by: Stefan Liebler <stli@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agosched/core: Fix TASK_DEAD race in finish_task_switch()
Peter Zijlstra [Tue, 29 Sep 2015 12:45:09 +0000 (14:45 +0200)]
sched/core: Fix TASK_DEAD race in finish_task_switch()

commit 95913d97914f44db2b81271c2e2ebd4d2ac2df83 upstream.

So the problem this patch is trying to address is as follows:

        CPU0                            CPU1

        context_switch(A, B)
                                        ttwu(A)
                                          LOCK A->pi_lock
                                          A->on_cpu == 0
        finish_task_switch(A)
          prev_state = A->state  <-.
          WMB                      |
          A->on_cpu = 0;           |
          UNLOCK rq0->lock         |
                                   |    context_switch(C, A)
                                   `--  A->state = TASK_DEAD
          prev_state == TASK_DEAD
            put_task_struct(A)
                                        context_switch(A, C)
                                        finish_task_switch(A)
                                          A->state == TASK_DEAD
                                            put_task_struct(A)

The argument being that the WMB will allow the load of A->state on CPU0
to cross over and observe CPU1's store of A->state, which will then
result in a double-drop and use-after-free.

Now the comment states (and this was true once upon a long time ago)
that we need to observe A->state while holding rq->lock because that
will order us against the wakeup; however the wakeup will not in fact
acquire (that) rq->lock; it takes A->pi_lock these days.

We can obviously fix this by upgrading the WMB to an MB, but that is
expensive, so we'd rather avoid that.

The alternative this patch takes is: smp_store_release(&A->on_cpu, 0),
which avoids the MB on some archs, but not important ones like ARM.

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: manfred@colorfullife.com
Cc: will.deacon@arm.com
Fixes: e4a52bcb9a18 ("sched: Remove rq->lock from the first half of ttwu()")
Link: http://lkml.kernel.org/r/20150929124509.GG3816@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoleds/led-class: Add missing put_device()
Ricardo Ribalda Delgado [Fri, 31 Jul 2015 11:36:21 +0000 (13:36 +0200)]
leds/led-class: Add missing put_device()

commit e5b5a61fcb3743f1dacf9e20d28f48423cecf0c1 upstream.

Devices found by class_find_device must be freed with put_device().
Otherwise the reference count will not work properly.

Fixes: a96aa64cb572 ("leds/led-class: Handle LEDs with the same name")
Reported-by: Alan Tull <delicious.quinoa@gmail.com>
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoleds:lp55xx: Correct Kconfig dependency for f/w user helper
Takashi Iwai [Mon, 7 Sep 2015 12:25:01 +0000 (14:25 +0200)]
leds:lp55xx: Correct Kconfig dependency for f/w user helper

commit 2338f73d407d5abe2036d92716ba25ef5279c3d2 upstream.

The commit [b67893206fc0: leds:lp55xx: fix firmware loading error]
tries to address the firmware file handling with user helper, but it
sets a wrong Kconfig CONFIG_FW_LOADER_USER_HELPER_FALLBACK.  Since the
wrong option was enabled, the system got a regression -- it suffers
from the unexpected long delays for non-present firmware files.

This patch corrects the Kconfig dependency to the right one,
CONFIG_FW_LOADER_USER_HELPER.  This doesn't change the fallback
behavior but only enables UMH when needed.

Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=944661
Fixes: b67893206fc0 ('leds:lp55xx: fix firmware loading error')
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agox86/xen: Support kexec/kdump in HVM guests by doing a soft reset
Vitaly Kuznetsov [Fri, 25 Sep 2015 09:59:52 +0000 (11:59 +0200)]
x86/xen: Support kexec/kdump in HVM guests by doing a soft reset

commit 0b34a166f291d255755be46e43ed5497cdd194f2 upstream.

Currently there is a number of issues preventing PVHVM Xen guests from
doing successful kexec/kdump:

  - Bound event channels.
  - Registered vcpu_info.
  - PIRQ/emuirq mappings.
  - shared_info frame after XENMAPSPACE_shared_info operation.
  - Active grant mappings.

Basically, newly booted kernel stumbles upon already set up Xen
interfaces and there is no way to reestablish them. In Xen-4.7 a new
feature called 'soft reset' is coming. A guest performing kexec/kdump
operation is supposed to call SCHEDOP_shutdown hypercall with
SHUTDOWN_soft_reset reason before jumping to new kernel. Hypervisor
(with some help from toolstack) will do full domain cleanup (but
keeping its memory and vCPU contexts intact) returning the guest to
the state it had when it was first booted and thus allowing it to
start over.

Doing SHUTDOWN_soft_reset on Xen hypervisors which don't support it is
probably OK as by default all unknown shutdown reasons cause domain
destroy with a message in toolstack log: 'Unknown shutdown reason code
5. Destroying domain.'  which gives a clue to what the problem is and
eliminates false expectations.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agox86/mm: Set NX on gap between __ex_table and rodata
Stephen Smalley [Thu, 1 Oct 2015 13:04:22 +0000 (09:04 -0400)]
x86/mm: Set NX on gap between __ex_table and rodata

commit ab76f7b4ab2397ffdd2f1eb07c55697d19991d10 upstream.

Unused space between the end of __ex_table and the start of
rodata can be left W+x in the kernel page tables.  Extend the
setting of the NX bit to cover this gap by starting from
text_end rather than rodata_start.

  Before:
  ---[ High Kernel Mapping ]---
  0xffffffff80000000-0xffffffff81000000          16M                               pmd
  0xffffffff81000000-0xffffffff81600000           6M     ro         PSE     GLB x  pmd
  0xffffffff81600000-0xffffffff81754000        1360K     ro                 GLB x  pte
  0xffffffff81754000-0xffffffff81800000         688K     RW                 GLB x  pte
  0xffffffff81800000-0xffffffff81a00000           2M     ro         PSE     GLB NX pmd
  0xffffffff81a00000-0xffffffff81b3b000        1260K     ro                 GLB NX pte
  0xffffffff81b3b000-0xffffffff82000000        4884K     RW                 GLB NX pte
  0xffffffff82000000-0xffffffff82200000           2M     RW         PSE     GLB NX pmd
  0xffffffff82200000-0xffffffffa0000000         478M                               pmd

  After:
  ---[ High Kernel Mapping ]---
  0xffffffff80000000-0xffffffff81000000          16M                               pmd
  0xffffffff81000000-0xffffffff81600000           6M     ro         PSE     GLB x  pmd
  0xffffffff81600000-0xffffffff81754000        1360K     ro                 GLB x  pte
  0xffffffff81754000-0xffffffff81800000         688K     RW                 GLB NX pte
  0xffffffff81800000-0xffffffff81a00000           2M     ro         PSE     GLB NX pmd
  0xffffffff81a00000-0xffffffff81b3b000        1260K     ro                 GLB NX pte
  0xffffffff81b3b000-0xffffffff82000000        4884K     RW                 GLB NX pte
  0xffffffff82000000-0xffffffff82200000           2M     RW         PSE     GLB NX pmd
  0xffffffff82200000-0xffffffffa0000000         478M                               pmd

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443704662-3138-1-git-send-email-sds@tycho.nsa.gov
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agox86/process: Add proper bound checks in 64bit get_wchan()
Thomas Gleixner [Wed, 30 Sep 2015 08:38:22 +0000 (08:38 +0000)]
x86/process: Add proper bound checks in 64bit get_wchan()

commit eddd3826a1a0190e5235703d1e666affa4d13b96 upstream.

Dmitry Vyukov reported the following using trinity and the memory
error detector AddressSanitizer
(https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel).

[ 124.575597] ERROR: AddressSanitizer: heap-buffer-overflow on
address ffff88002e280000
[ 124.576801] ffff88002e280000 is located 131938492886538 bytes to
the left of 28857600-byte region [ffffffff81282e0affffffff82e0830a)
[ 124.578633] Accessed by thread T10915:
[ 124.579295] inlined in describe_heap_address
./arch/x86/mm/asan/report.c:164
[ 124.579295] #0 ffffffff810dd277 in asan_report_error
./arch/x86/mm/asan/report.c:278
[ 124.580137] #1 ffffffff810dc6a0 in asan_check_region
./arch/x86/mm/asan/asan.c:37
[ 124.581050] #2 ffffffff810dd423 in __tsan_read8 ??:0
[ 124.581893] #3 ffffffff8107c093 in get_wchan
./arch/x86/kernel/process_64.c:444

The address checks in the 64bit implementation of get_wchan() are
wrong in several ways:

 - The lower bound of the stack is not the start of the stack
   page. It's the start of the stack page plus sizeof (struct
   thread_info)

 - The upper bound must be:

       top_of_stack - TOP_OF_KERNEL_STACK_PADDING - 2 * sizeof(unsigned long).

   The 2 * sizeof(unsigned long) is required because the stack pointer
   points at the frame pointer. The layout on the stack is: ... IP FP
   ... IP FP. So we need to make sure that both IP and FP are in the
   bounds.

Fix the bound checks and get rid of the mix of numeric constants, u64
and unsigned long. Making all unsigned long allows us to use the same
function for 32bit as well.

Use READ_ONCE() when accessing the stack. This does not prevent a
concurrent wakeup of the task and the stack changing, but at least it
avoids TOCTOU.

Also check task state at the end of the loop. Again that does not
prevent concurrent changes, but it avoids walking for nothing.

Add proper comments while at it.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Based-on-patch-from: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: kasan-dev <kasan-dev@googlegroups.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de>
Link: http://lkml.kernel.org/r/20150930083302.694788319@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agox86/kexec: Fix kexec crash in syscall kexec_file_load()
Lee, Chun-Yi [Tue, 29 Sep 2015 12:58:57 +0000 (20:58 +0800)]
x86/kexec: Fix kexec crash in syscall kexec_file_load()

commit e3c41e37b0f4b18cbd4dac76cbeece5a7558b909 upstream.

The original bug is a page fault crash that sometimes happens
on big machines when preparing ELF headers:

    BUG: unable to handle kernel paging request at ffffc90613fc9000
    IP: [<ffffffff8103d645>] prepare_elf64_ram_headers_callback+0x165/0x260

The bug is caused by us under-counting the number of memory ranges
and subsequently not allocating enough ELF header space for them.
The bug is typically masked on smaller systems, because the ELF header
allocation is rounded up to the next page.

This patch modifies the code in fill_up_crash_elf_data() by using
walk_system_ram_res() instead of walk_system_ram_range() to correctly
count the max number of crash memory ranges. That's because the
walk_system_ram_range() filters out small memory regions that
reside in the same page, but walk_system_ram_res() does not.

Here's how I found the bug:

After tracing prepare_elf64_headers() and prepare_elf64_ram_headers_callback(),
the code uses walk_system_ram_res() to fill-in crash memory regions information
to the program header, so it counts those small memory regions that
reside in a page area.

But, when the kernel was using walk_system_ram_range() in
fill_up_crash_elf_data() to count the number of crash memory regions,
it filters out small regions.

I printed those small memory regions, for example:

  kexec: Get nr_ram ranges. vaddr=0xffff880077592258 paddr=0x77592258, sz=0xdc0

Based on the code in walk_system_ram_range(), this memory region
will be filtered out:

  pfn = (0x77592258 + 0x1000 - 1) >> 12 = 0x77593
  end_pfn = (0x77592258 + 0xfc0 -1 + 1) >> 12 = 0x77593
  end_pfn - pfn = 0x77593 - 0x77593 = 0  <=== if (end_pfn > pfn) is FALSE

So, the max_nr_ranges that's counted by the kernel doesn't include
small memory regions - causing us to under-allocate the required space.
That causes the page fault crash that happens in a later code path
when preparing ELF headers.

This bug is not easy to reproduce on small machines that have few
CPUs, because the allocated page aligned ELF buffer has more free
space to cover those small memory regions' PT_LOAD headers.

Signed-off-by: Lee, Chun-Yi <jlee@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: kexec@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443531537-29436-1-git-send-email-jlee@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agox86/efi: Fix boot crash by mapping EFI memmap entries bottom-up at runtime, instead...
Matt Fleming [Fri, 25 Sep 2015 22:02:18 +0000 (23:02 +0100)]
x86/efi: Fix boot crash by mapping EFI memmap entries bottom-up at runtime, instead of top-down

commit a5caa209ba9c29c6421292e7879d2387a2ef39c9 upstream.

Beginning with UEFI v2.5 EFI_PROPERTIES_TABLE was introduced
that signals that the firmware PE/COFF loader supports splitting
code and data sections of PE/COFF images into separate EFI
memory map entries. This allows the kernel to map those regions
with strict memory protections, e.g. EFI_MEMORY_RO for code,
EFI_MEMORY_XP for data, etc.

Unfortunately, an unwritten requirement of this new feature is
that the regions need to be mapped with the same offsets
relative to each other as observed in the EFI memory map. If
this is not done crashes like this may occur,

  BUG: unable to handle kernel paging request at fffffffefe6086dd
  IP: [<fffffffefe6086dd>] 0xfffffffefe6086dd
  Call Trace:
   [<ffffffff8104c90e>] efi_call+0x7e/0x100
   [<ffffffff81602091>] ? virt_efi_set_variable+0x61/0x90
   [<ffffffff8104c583>] efi_delete_dummy_variable+0x63/0x70
   [<ffffffff81f4e4aa>] efi_enter_virtual_mode+0x383/0x392
   [<ffffffff81f37e1b>] start_kernel+0x38a/0x417
   [<ffffffff81f37495>] x86_64_start_reservations+0x2a/0x2c
   [<ffffffff81f37582>] x86_64_start_kernel+0xeb/0xef

Here 0xfffffffefe6086dd refers to an address the firmware
expects to be mapped but which the OS never claimed was mapped.
The issue is that included in these regions are relative
addresses to other regions which were emitted by the firmware
toolchain before the "splitting" of sections occurred at
runtime.

Needless to say, we don't satisfy this unwritten requirement on
x86_64 and instead map the EFI memory map entries in reverse
order. The above crash is almost certainly triggerable with any
kernel newer than v3.13 because that's when we rewrote the EFI
runtime region mapping code, in commit d2f7cbe7b26a ("x86/efi:
Runtime services virtual mapping"). For kernel versions before
v3.13 things may work by pure luck depending on the
fragmentation of the kernel virtual address space at the time we
map the EFI regions.

Instead of mapping the EFI memory map entries in reverse order,
where entry N has a higher virtual address than entry N+1, map
them in the same order as they appear in the EFI memory map to
preserve this relative offset between regions.

This patch has been kept as small as possible with the intention
that it should be applied aggressively to stable and
distribution kernels. It is very much a bugfix rather than
support for a new feature, since when EFI_PROPERTIES_TABLE is
enabled we must map things as outlined above to even boot - we
have no way of asking the firmware not to split the code/data
regions.

In fact, this patch doesn't even make use of the more strict
memory protections available in UEFI v2.5. That will come later.

Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chun-Yi <jlee@suse.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: James Bottomley <JBottomley@Odin.com>
Cc: Lee, Chun-Yi <jlee@suse.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443218539-7610-2-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoUse WARN_ON_ONCE for missing X86_FEATURE_NRIPS
Dirk Müller [Thu, 1 Oct 2015 11:43:42 +0000 (13:43 +0200)]
Use WARN_ON_ONCE for missing X86_FEATURE_NRIPS

commit d2922422c48df93f3edff7d872ee4f3191fefb08 upstream.

The cpu feature flags are not ever going to change, so warning
everytime can cause a lot of kernel log spam
(in our case more than 10GB/hour).

The warning seems to only occur when nested virtualization is
enabled, so it's probably triggered by a KVM bug.  This is a
sensible and safe change anyway, and the KVM bug fix might not
be suitable for stable releases anyway.

Signed-off-by: Dirk Mueller <dmueller@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agox86/nmi/64: Fix a paravirt stack-clobbering bug in the NMI code
Andy Lutomirski [Sun, 20 Sep 2015 23:32:05 +0000 (16:32 -0700)]
x86/nmi/64: Fix a paravirt stack-clobbering bug in the NMI code

commit 83c133cf11fb0e68a51681447e372489f052d40e upstream.

The NMI entry code that switches to the normal kernel stack needs to
be very careful not to clobber any extra stack slots on the NMI
stack.  The code is fine under the assumption that SWAPGS is just a
normal instruction, but that assumption isn't really true.  Use
SWAPGS_UNSAFE_STACK instead.

This is part of a fix for some random crashes that Sasha saw.

Fixes: 9b6e6a8334d5 ("x86/nmi/64: Switch stacks on userspace NMI entry")
Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Link: http://lkml.kernel.org/r/974bc40edffdb5c2950a5c4977f821a446b76178.1442791737.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agox86/paravirt: Replace the paravirt nop with a bona fide empty function
Andy Lutomirski [Sun, 20 Sep 2015 23:32:04 +0000 (16:32 -0700)]
x86/paravirt: Replace the paravirt nop with a bona fide empty function

commit fc57a7c68020dcf954428869eafd934c0ab1536f upstream.

PARAVIRT_ADJUST_EXCEPTION_FRAME generates this code (using nmi as an
example, trimmed for readability):

    ff 15 00 00 00 00       callq  *0x0(%rip)        # 2796 <nmi+0x6>
              2792: R_X86_64_PC32     pv_irq_ops+0x2c

That's a call through a function pointer to regular C function that
does nothing on native boots, but that function isn't protected
against kprobes, isn't marked notrace, and is certainly not
guaranteed to preserve any registers if the compiler is feeling
perverse.  This is bad news for a CLBR_NONE operation.

Of course, if everything works correctly, once paravirt ops are
patched, it gets nopped out, but what if we hit this code before
paravirt ops are patched in?  This can potentially cause breakage
that is very difficult to debug.

A more subtle failure is possible here, too: if _paravirt_nop uses
the stack at all (even just to push RBP), it will overwrite the "NMI
executing" variable if it's called in the NMI prologue.

The Xen case, perhaps surprisingly, is fine, because it's already
written in asm.

Fix all of the cases that default to paravirt_nop (including
adjust_exception_frame) with a big hammer: replace paravirt_nop with
an asm function that is just a ret instruction.

The Xen case may have other problems, so document them.

This is part of a fix for some random crashes that Sasha saw.

Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Link: http://lkml.kernel.org/r/8f5d2ba295f9d73751c33d97fda03e0495d9ade0.1442791737.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agox86/pci/intel_mid_pci: Work around for IRQ0 assignment
Andy Shevchenko [Wed, 29 Jul 2015 09:16:47 +0000 (12:16 +0300)]
x86/pci/intel_mid_pci: Work around for IRQ0 assignment

commit 39d9b77b8debb4746e189aa5b61ae6e81ec5eab8 upstream.

On Intel Tangier the MMC host controller is wired up to irq 0. But
several other devices have irq 0 associated as well due to a bogus PCI
configuration.

The first initialized driver will acquire irq 0 and make it
unavailable for other devices. If the sdhci driver is not the first
one it will fail to acquire the interrupt and therefor be non
functional.

Add a quirk to the pci irq enable function which denies irq 0 to
anything else than the MMC host controller driver on Tangier
platforms.

Fixes: 90b9aacf912a (serial: 8250_pci: add Intel Tangier support)
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/1438161409-4671-2-git-send-email-andriy.shevchenko@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agox86/ioapic: Force affinity setting in setup_ioapic_dest()
Thomas Gleixner [Mon, 14 Sep 2015 10:00:55 +0000 (12:00 +0200)]
x86/ioapic: Force affinity setting in setup_ioapic_dest()

commit 4857c91f0d195f05908fff296ba1ec5fca87066c upstream.

The recent ioapic cleanups changed the affinity setting in
setup_ioapic_dest() from a direct write to the hardware to the delayed
affinity setup via irq_set_affinity().

That results in a warning from chained_irq_exit():
WARNING: CPU: 0 PID: 5 at kernel/irq/migration.c:32 irq_move_masked_irq
[<ffffffff810a0a88>] irq_move_masked_irq+0xb8/0xc0
[<ffffffff8103c161>] ioapic_ack_level+0x111/0x130
[<ffffffff812bbfe8>] intel_gpio_irq_handler+0x148/0x1c0

The reason is that irq_set_affinity() does not write directly to the
hardware. It marks the affinity setting as pending and executes it
from the next interrupt. The chained handler infrastructure does not
take the irq descriptor lock for performance reasons because such a
chained interrupt is not visible to any interfaces. So the delayed
affinity setting triggers the warning in irq_move_masked_irq().

Restore the old behaviour by calling the set_affinity function of the
ioapic chip in setup_ioapic_dest(). This is safe as none of the
interrupts can be on the fly at this point.

Fixes: aa5cb97f14a2 'x86/irq: Remove x86_io_apic_ops.set_affinity and related interfaces'
Reported-and-tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: jarkko.nikula@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agox86/platform: Fix Geode LX timekeeping in the generic x86 build
David Woodhouse [Wed, 16 Sep 2015 13:10:03 +0000 (14:10 +0100)]
x86/platform: Fix Geode LX timekeeping in the generic x86 build

commit 03da3ff1cfcd7774c8780d2547ba0d995f7dc03d upstream.

In 2007, commit 07190a08eef36 ("Mark TSC on GeodeLX reliable")
bypassed verification of the TSC on Geode LX. However, this code
(now in the check_system_tsc_reliable() function in
arch/x86/kernel/tsc.c) was only present if CONFIG_MGEODE_LX was
set.

OpenWRT has recently started building its generic Geode target
for Geode GX, not LX, to include support for additional
platforms. This broke the timekeeping on LX-based devices,
because the TSC wasn't marked as reliable:
https://dev.openwrt.org/ticket/20531

By adding a runtime check on is_geode_lx(), we can also include
the fix if CONFIG_MGEODEGX1 or CONFIG_X86_GENERIC are set, thus
fixing the problem.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Andres Salomon <dilinger@queued.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Marcelo Tosatti <marcelo@kvack.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1442409003.131189.87.camel@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>