From adrian@jacinta.home.cacheboy.net Thu Mar 13 03:58:47 2008 Return-Path: Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7FC381065671 for ; Thu, 13 Mar 2008 03:58:47 +0000 (UTC) (envelope-from adrian@jacinta.home.cacheboy.net) Received: from jacinta.home.cacheboy.net (home.cacheboy.net [203.56.15.74]) by mx1.freebsd.org (Postfix) with ESMTP id BC0C28FC1B for ; Thu, 13 Mar 2008 03:58:45 +0000 (UTC) (envelope-from adrian@jacinta.home.cacheboy.net) Received: from jacinta.home.cacheboy.net (localhost [127.0.0.1]) by jacinta.home.cacheboy.net (8.14.2/8.14.1) with ESMTP id m2C37sJb006395 for ; Wed, 12 Mar 2008 03:07:54 GMT (envelope-from adrian@jacinta.home.cacheboy.net) Received: (from adrian@localhost) by jacinta.home.cacheboy.net (8.14.2/8.14.1/Submit) id m2C37pO7006394; Wed, 12 Mar 2008 03:07:51 GMT (envelope-from adrian) Message-Id: <200803120307.m2C37pO7006394@jacinta.home.cacheboy.net> Date: Wed, 12 Mar 2008 03:07:51 GMT From: Adrian Chadd Reply-To: Adrian Chadd To: FreeBSD-gnats-submit@freebsd.org Cc: Subject: hwpmc(4) incorrectly handles PMC sampling events from AMD X-Send-Pr-Version: 3.113 X-GNATS-Notify: >Number: 121660 >Category: kern >Synopsis: [hwpmc] [patch] hwpmc(4) incorrectly handles PMC sampling events from AMD >Confidential: no >Severity: non-critical >Priority: low >Responsible: jkoshy >State: feedback >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Thu Mar 13 04:00:07 UTC 2008 >Closed-Date: >Last-Modified: Thu Apr 10 08:59:13 UTC 2008 >Originator: Adrian Chadd >Release: FreeBSD 8.0-CURRENT i386 >Organization: FreeBSD >Environment: System: FreeBSD jacinta.home.cacheboy.net 8.0-CURRENT FreeBSD 8.0-CURRENT #5: Sun Mar 9 19:34:11 UTC 2008 adrian@jacinta.home.cacheboy.net:/data/1/obj/usr/src/sys/JACINTA i386 Copyright (c) 1992-2008 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 8.0-CURRENT #5: Sun Mar 9 19:34:11 UTC 2008 adrian@jacinta.home.cacheboy.net:/data/1/obj/usr/src/sys/JACINTA Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: AMD Athlon(tm) XP 1800+ (1540.49-MHz 686-class CPU) Origin = "AuthenticAMD" Id = 0x681 Stepping = 1 Features=0x383fbff AMD Features=0xc0400800 real memory = 2147418112 (2047 MB) avail memory = 2093957120 (1996 MB) MPTable: ioapic0: Assuming intbase of 0 ioapic0 irqs 0-23 on motherboard kbd1 at kbdmux0 ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) cpu0 on motherboard pcib0: pcibus 0 on motherboard pci0: on pcib0 agp0: on hostb0 agp0: aperture size is 256M pcib1: at device 1.0 on pci0 pci1: on pcib1 vgapci0: mem 0xd8000000-0xd9ffffff,0xda000000-0xda003fff,0xdb000000-0xdb7fffff irq 16 at device 0.0 on pci1 em0: port 0xd000-0xd03f mem 0xde020000-0xde03ffff,0xde000000-0xde01ffff irq 19 at device 11.0 on pci0 em0: [FILTER] em0: Ethernet address: 00:0e:0c:b9:4c:f9 uhci0: port 0xd400-0xd41f irq 21 at device 16.0 on pci0 uhci0: [GIANT-LOCKED] uhci0: [ITHREAD] usb0: on uhci0 usb0: USB revision 1.0 uhub0: on usb0 uhub0: 2 ports with 2 removable, self powered uhci1: port 0xd800-0xd81f irq 21 at device 16.1 on pci0 uhci1: [GIANT-LOCKED] uhci1: [ITHREAD] usb1: on uhci1 usb1: USB revision 1.0 uhub1: on usb1 uhub1: 2 ports with 2 removable, self powered uhci2: port 0xdc00-0xdc1f irq 21 at device 16.2 on pci0 uhci2: [GIANT-LOCKED] uhci2: [ITHREAD] usb2: on uhci2 usb2: USB revision 1.0 uhub2: on usb2 uhub2: 2 ports with 2 removable, self powered ehci0: mem 0xde040000-0xde0400ff irq 19 at device 16.3 on pci0 ehci0: [GIANT-LOCKED] ehci0: [ITHREAD] usb3: EHCI version 1.0 usb3: companion controllers, 2 ports each: usb0 usb1 usb2 usb3: on ehci0 usb3: USB revision 2.0 uhub3: on usb3 uhub3: 6 ports with 6 removable, self powered isab0: at device 17.0 on pci0 isa0: on isab0 atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe000-0xe00f at device 17.1 on pci0 ata0: on atapci0 ata0: [ITHREAD] ata1: on atapci0 ata1: [ITHREAD] pci0: at device 17.5 (no driver attached) pmtimer0 on isa0 orm0: at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff pnpid ORM0000 on isa0 atkbdc0: at port 0x60,0x64 on isa0 atkbd0: irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] atkbd0: [ITHREAD] fdc0: at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0 fdc0: [FILTER] ppc0: at port 0x378-0x37f irq 7 on isa0 ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode ppbus0: on ppc0 ppbus0: [ITHREAD] plip0: on ppbus0 plip0: WARNING: using obsoleted IFF_NEEDSGIANT flag lpt0: on ppbus0 lpt0: Interrupt-driven port ppi0: on ppbus0 ppc0: [GIANT-LOCKED] ppc0: [ITHREAD] sc0: at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0 sio0: type 16550A sio0: [FILTER] sio1: configured irq 3 not in bitmap of probed irqs 0 sio1: port may not be enabled sio1: configured irq 3 not in bitmap of probed irqs 0 sio1: port may not be enabled sio1 at port 0x2f8-0x2ff irq 3 on isa0 sio1: type 16550A sio1: [FILTER] vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 unknown: can't assign resources (port) unknown: can't assign resources (memory) unknown: can't assign resources (port) unknown: can't assign resources (port) unknown: can't assign resources (port) unknown: can't assign resources (port) Timecounter "TSC" frequency 1540490006 Hz quality 800 Timecounters tick every 1.000 msec ad0: 38166MB at ata0-master UDMA100 Trying to mount root from ufs:/dev/ad0s1a WARNING: / was not properly dismounted WARNING: /data/1 was not properly dismounted WARNING: /home was not properly dismounted WARNING: /var was not properly dismounted ipfw2 (+ipv6) initialized, divert loadable, nat loadable, rule-based forwarding disabled, default to deny, logging disabled >Description: hwpmc returns nothing from a user-mode sample (pmcstat -P instructions -O sample.out -t ) The hwpmc registers are 48 bit on at least the Athlon XP platform (and the code sets the width value in the class for all AMD CPUs to 48, not 64). The sampling is done by counting upwards until the counter loops and generates an interrupt. The code uses a 2s compliment trick to turn the sample period counter into a counter number useful for generating an NMI. It does this on a 64 bit value but as the counters are 48 bit, it will read back a 48 bit value with the high 16 bits set to 0 and the virtual PMC stuff quickly loses track. My email to -current has more info: Between my Athlon XP box giving me no useful pmc stats and my new Core 2 duo box not even working with pmc, I decided to poke at the Athlon XP support a bit to see if I could figure out what was going on. It seems that at least my revision of the Athlon XP has 48 bit performance counters (AMD Athlon Processor x86 Code Optimisation Guide, page 235 (Performance-Monitoring Counters: Overview) and the top 16 bits read back 0x0000. Since the code is taking the 2's compliment of the stored PMC value (which is so the value is incremented to 0xffffffffffffffff and wraps over, generating an NMI - mentioned on page 240), negating the value gives humerous results: (Note: some of these are my own debugging information.) Mar 9 16:09:43 jacinta kernel: hwpmc: TSC/1/0x20 K7/4/0x1ff Mar 9 16:10:02 jacinta kernel: MDP:SWO:1: pc=0xc5814180 pp=0 enable-msr=0 Mar 9 16:10:02 jacinta kernel: local initial: ri 1: 65536 Mar 9 16:10:02 jacinta kernel: MDP:SWO:1: pc=0xc5814180 pp=0xc576a780 enable-msr=0 Mar 9 16:10:02 jacinta kernel: csw_in: ri 1; pmcval 65536 Mar 9 16:10:02 jacinta kernel: MDP:WRI:1: amd-write cpu=0 ri=1 v=ffffffffffff0000 Mar 9 16:10:02 jacinta kernel: MDP:SWI:1: pc=0xc5814180 pp=0xc576a780 enable-msr=0 Mar 9 16:10:02 jacinta kernel: MDP:REA:1: amd-read id=1 class=1 Mar 9 16:10:02 jacinta kernel: MDP:REA:2: amd-read id=1 -> ffff00000000ff01 Mar 9 16:10:02 jacinta kernel: read: ffff00000000ff01; saved 10000; diff -281474976710911 Mar 9 16:10:02 jacinta kernel: csw_out: ri 1: pp_pmcval 65536.. Mar 9 16:10:02 jacinta kernel: csw_out: ... ri 1: pp_pmcval now 281474976710911.. Mar 9 16:10:02 jacinta kernel: MDP:SWO:1: pc=0xc5814180 pp=0xc576a780 enable-msr=0 Mar 9 16:10:02 jacinta kernel: csw_in: ri 1; pmcval 281474976710911 Mar 9 16:10:02 jacinta kernel: MDP:WRI:1: amd-write cpu=0 ri=1 v=fffeffffffffff01 Mar 9 16:10:02 jacinta kernel: MDP:SWI:1: pc=0xc5814180 pp=0xc576a780 enable-msr=0 Mar 9 16:10:02 jacinta kernel: MDP:REA:1: amd-read id=1 class=1 Mar 9 16:10:02 jacinta kernel: MDP:REA:2: amd-read id=1 -> ffff00000000f47f Mar 9 16:10:02 jacinta kernel: read: ffff00000000f47f; saved 10000000000ff; diff -562949953358976 Mar 9 16:10:02 jacinta kernel: csw_out: ri 1: pp_pmcval 281474976710911.. Mar 9 16:10:02 jacinta kernel: csw_out: ... ri 1: pp_pmcval now 844424930004351.. >How-To-Repeat: pmcstat -P instructions -O sample.out -t pid >Fix: This attempts to "pretend" to be the expected value - and it begins recording sample events in the above test - but I don't believe its correct. If the value rolls over somehow then we'll be OR'ing in high bits inappropriately. I think it should be a sign-extend rather than my OR operation. Index: hwpmc_amd.c =================================================================== RCS file: /share/FreeBSD/cvsrepo/src/sys/dev/hwpmc/hwpmc_amd.c,v retrieving revision 1.14 diff -u -r1.14 hwpmc_amd.c --- hwpmc_amd.c 7 Dec 2007 08:20:15 -0000 1.14 +++ hwpmc_amd.c 10 Mar 2008 12:06:49 -0000 @@ -303,7 +303,12 @@ tmp = rdmsr(pd->pm_perfctr); /* RDMSR serializes */ if (PMC_IS_SAMPLING_MODE(mode)) - *v = AMD_PERFCTR_VALUE_TO_RELOAD_COUNT(tmp); + /* + * The counters are 48 bit - so we need to "pretend" the 48 bit value + * is 64 bit for the 2s compliment conversion to convert correctly. + * I don't think this is "correct" answer! + */ + *v = AMD_PERFCTR_VALUE_TO_RELOAD_COUNT(tmp | 0xffff000000000000); else *v = tmp; >Release-Note: >Audit-Trail: Responsible-Changed-From-To: freebsd-bugs->jkoshy Responsible-Changed-By: rwatson Responsible-Changed-When: Thu Mar 13 12:41:28 UTC 2008 Responsible-Changed-Why: Over to maintainer. http://www.freebsd.org/cgi/query-pr.cgi?pr=121660 From: "Adrian Chadd" To: bug-followup@FreeBSD.org, adrian@freebsd.org Cc: Subject: Re: kern/121660: [hwpmc] [patch] hwpmc(4) incorrectly handles PMC sampling events from AMD Date: Sun, 16 Mar 2008 14:07:58 +0900 This patch sign extends the 48 bit to 64 bits before applying 2s compliement to restore the original sampling value. It seems to work for me. Index: hwpmc_amd.c =================================================================== RCS file: /share/FreeBSD/cvsrepo/src/sys/dev/hwpmc/hwpmc_amd.c,v retrieving revision 1.14 diff -u -r1.14 hwpmc_amd.c --- hwpmc_amd.c 7 Dec 2007 08:20:15 -0000 1.14 +++ hwpmc_amd.c 16 Mar 2008 05:02:38 -0000 @@ -302,12 +302,22 @@ #endif tmp = rdmsr(pd->pm_perfctr); /* RDMSR serializes */ - if (PMC_IS_SAMPLING_MODE(mode)) - *v = AMD_PERFCTR_VALUE_TO_RELOAD_COUNT(tmp); - else - *v = tmp; + PMCDBG(MDP,REA,2,"amd-read (pre-munge) id=%d -> %jd", ri, tmp); + if (PMC_IS_SAMPLING_MODE(mode)) { + /* + * The counters are 48 bit but we expect them to be 64 bit for + * 2s compliment "hack" done to allow the "count up" to overflow + * which triggers the sampling mode NMI. + * This code sign-extends the 48 bit number in case the returned + * value requires it. + */ + if (tmp & 0x0000800000000000) + tmp |= 0xffff000000000000; + tmp = AMD_PERFCTR_VALUE_TO_RELOAD_COUNT(tmp); + } + *v = tmp; - PMCDBG(MDP,REA,2,"amd-read id=%d -> %jd", ri, *v); + PMCDBG(MDP,REA,2,"amd-read (post-munge) id=%d -> %jd", ri, *v); return 0; } @@ -683,7 +693,7 @@ KASSERT(cpu >= 0 && cpu < mp_ncpus, ("[amd,%d] out of range CPU %d", __LINE__, cpu)); - PMCDBG(MDP,INT,1, "cpu=%d tf=0x%p um=%d", cpu, (void *) tf, + PMCDBG(MDP,INT,1, "cpu=%d tf=%p um=%d", cpu, (void *) tf, TRAPF_USERMODE(tf)); retval = 0; From: Joseph Koshy To: "Adrian Chadd" Cc: jkoshy@FreeBSD.org, bug-followup@freebsd.org Subject: Re: kern/121660: [hwpmc] [patch] hwpmc(4) incorrectly handles PMC sampling events from AMD Date: Mon, 17 Mar 2008 05:58:27 -0000 > This patch sign extends the 48 bit to 64 bits before applying 2s > compliement to restore the original sampling value. It seems to work > for me. That is great news. I have a simpler patch to offer, though. This replaces the test and branch with two shifts. Index: hwpmc_amd.c =================================================================== RCS file: /cvs/FreeBSD/src/sys/dev/hwpmc/hwpmc_amd.c,v retrieving revision 1.9.2.1 diff -u -r1.9.2.1 hwpmc_amd.c --- hwpmc_amd.c 15 Sep 2005 15:48:16 -0000 1.9.2.1 +++ hwpmc_amd.c 16 Mar 2008 16:48:27 -0000 @@ -298,9 +298,11 @@ #endif tmp = rdmsr(pd->pm_perfctr); /* RDMSR serializes */ - if (PMC_IS_SAMPLING_MODE(mode)) + if (PMC_IS_SAMPLING_MODE(mode)) { + /* Sign extend 48 bit value to 64 bits. */ + tmp = (pmc_value_t) (((int64_t) tmp << 16) >> 16); *v = AMD_PERFCTR_VALUE_TO_RELOAD_COUNT(tmp); - else + } else *v = tmp; PMCDBG(MDP,REA,2,"amd-read id=%d -> %jd", ri, *v); The use of 2's complement isn't a 'hack'; these counters only count up, so if you want sample every 'n' events, you have to program the counters with an initial value that is 'n' less than the overflow limit. This is documented in the manual, IIRC. The additional debug messages you've added are fine; if you feel they will be to be useful to someone in the future, feel free to add them to CVS. Regards, Koshy From: dfilter@FreeBSD.ORG (dfilter service) To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/121660: commit references a PR Date: Tue, 18 Mar 2008 08:39:18 +0000 (UTC) adrian 2008-03-18 08:39:12 UTC FreeBSD src repository Modified files: sys/dev/hwpmc hwpmc_amd.c Log: Sign-extend the 48-bit AMD PMC counter before treating it to a 64-bit 2's compliment. The 2's compliment transform is done so a "count down" sampling interval can be converted into a "count up" PMC value. a 2's complimented 'count down' value is written to the PMC counter; then the read-back counter is reverted via another 2's compliment. PR: kern/121660 Reviewed by: jkoshy Approved by: jkoshy MFC after: 1 week Revision Changes Path 1.16 +8 -5 src/sys/dev/hwpmc/hwpmc_amd.c _______________________________________________ cvs-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/cvs-all To unsubscribe, send any mail to "cvs-all-unsubscribe@freebsd.org" From: Joseph Koshy To: Joseph Koshy Cc: "Adrian Chadd" , bug-followup@freebsd.org Subject: Re: kern/121660: [hwpmc] [patch] hwpmc(4) incorrectly handles PMC sampling events from AMD Date: Wed, 19 Mar 2008 05:15:42 -0000 jk> I have a simpler patch to offer, though. This replaces the test and jk> branch with two shifts. And here's an even simpler patch that keeps the flow of the code in hwpmc_amd.c intact. It is cleaner to keep the conversion to and from the perf counter's representation and the MI representation within these macros instead of partially inside and partially outside them. Index: hwpmc_amd.h =================================================================== RCS file: /cvs/FreeBSD/src/sys/dev/hwpmc/hwpmc_amd.h,v retrieving revision 1.1 diff -u -r1.1 hwpmc_amd.h --- hwpmc_amd.h 9 Jun 2005 19:45:07 -0000 1.1 +++ hwpmc_amd.h 18 Mar 2008 16:22:03 -0000 @@ -79,8 +79,9 @@ #define AMD_PMC_IS_STOPPED(evsel) ((rdmsr((evsel)) & AMD_PMC_ENABLE) == 0) #define AMD_PMC_HAS_OVERFLOWED(pmc) ((rdpmc(pmc) & (1ULL << 47)) == 0) -#define AMD_RELOAD_COUNT_TO_PERFCTR_VALUE(V) (-(V)) -#define AMD_PERFCTR_VALUE_TO_RELOAD_COUNT(P) (-(P)) +#define AMD_PERFCTR_LIMIT 0x0000FFFFFFFFFFFFULL +#define AMD_RELOAD_COUNT_TO_PERFCTR_VALUE(V) (AMD_PERFCTR_LIMIT - (V) + 1) +#define AMD_PERFCTR_VALUE_TO_RELOAD_COUNT(P) (AMD_PERFCTR_LIMIT - (P) + 1) struct pmc_md_amd_op_pmcallocate { uint32_t pm_amd_config; Could you let me know if this change works ok? Koshy State-Changed-From-To: open->feedback State-Changed-By: jkoshy State-Changed-When: Thu Apr 10 08:58:34 UTC 2008 State-Changed-Why: Awaiting feedback on revised patch. http://www.freebsd.org/cgi/query-pr.cgi?pr=121660 >Unformatted: