Thursday, January 27, 2011

Per-processor Data

A new article Per-processor Data is available under Lockfree/Tips&Tricks section. It introduces an interesting technique of per-processor data, as well as some implementation aspects.

46 comments:

  1. FYI http://www.kernel.org/doc/man-pages/online/pages/man2/getcpu.2.html

    VERSIONS
    getcpu() was added in kernel 2.6.19 for x86_64 and i386.

    ReplyDelete
  2. And the code to use it since the glibc doesn't export it is:


    /* This code is in the public domain */
    #define _GNU_SOURCE
    #include
    #include
    #include

    struct getcpu_cache {
    unsigned long blob[128 / sizeof(long)];
    };

    static long (*vgetcpu)(unsigned *cpu, unsigned *node, struct getcpu_cache *tcache);

    static int init_vgetcpu(void)
    {
    void *vdso;

    dlerror();
    vdso = dlopen("linux-vdso.so.1", RTLD_LAZY);
    if (vdso == NULL)
    return -1;
    vgetcpu = dlsym(vdso, "__vdso_getcpu");
    dlclose(vdso);
    return vgetcpu == NULL ? -1 : 0;
    }

    int main(void)
    {
    unsigned cpu, node;
    struct getcpu_cache cache;

    if (init_vgetcpu() < 0) {
    fprintf(stderr, "Unable to locate vgetcpu: %s", dlerror());
    return -1;
    }

    if (vgetcpu(&cpu, &node, &cache) < 0) {
    perror("vgetcpu");
    return -1;
    }
    printf("cpu:%d node:%d\n", cpu, node);
    return 0;
    }


    and using it gives:
    $ repeat 10 ./vgetcpu
    cpu:0 node:0
    cpu:1 node:0
    cpu:0 node:0
    cpu:1 node:0
    cpu:0 node:0
    cpu:1 node:0
    cpu:1 node:0
    cpu:0 node:0
    cpu:1 node:0
    cpu:1 node:0

    (this is a core2 duo, I suppose that node is != 0 when this is a hardware thread but I don't have access to HT machines right now so somebody could probably confirm that)

    ReplyDelete
  3. Which seems to only work on X64, I must reckon I've no idea how vdso works on i386 (but who cares anyway ;p)

    ReplyDelete
  4. and I'm wrong, glibc returns it through sched_getcpu... "woopsie" ;)

    ReplyDelete
  5. getcpu is still rather heavy-weight compared to a direct call to rdtscp, vsyscall is definitely cheap in some cases but it isn't free. Per-processor distribution is nice, but sometimes you may simply want cache line distribution in which case you can use a simple subscribe/unsubscribe interface (which will also at least help alleviate some hotspot pains).

    For example,
    pid = subscribe(data_structure);
    operation(pid, data_structure);
    ...
    ...
    unsubscribe(data_structure, pid);

    where pid may be a map to a unique cache line. Additionally, if CPU migration is common and you're working with relatively small data sets then locality benefits may not exceed the cost of cache line invalidation.

    ReplyDelete
  6. @MadCoder Thanks for your suggestions. sched_getcpu() is not quite vgetcpu(). It seems that sched_getcpu() initially was a slow syscall, but I see that there is a constant progress going on:
    http://kerneltrap.org/mailarchive/linux-kernel/2008/9/23/3387414
    I gave sched_getcpu() one more try, and it seems that now it's quite fast (potentially implemented with the same SIDT or LSL).
    I will update the article with this and other aspects.

    ReplyDelete
  7. @sbahra
    It seems that sched_getcpu() is rather fast nowadays:
    http://kerneltrap.org/mailarchive/linux-kernel/2008/9/23/3387414
    Indeed, there is not reason why it can't be reimplemented with SIDT/LSL (which a cheaper that RDTSCP because there are not serializing instructions).

    > but sometimes you may simply want cache line distribution

    Well, actually I implied that it's a baseline, and what a lot of people are able to do. So it's vice versa :)
    Per-thread subscription/unsubscription/handoff is usually leads to quite complex implementations.

    > if CPU migration is common

    I think then nothing will help you :)

    ReplyDelete
  8. I've updated the article with sched_getcpu(), LSL, and added clear indication that "remember in user-space it's no more than an optimization - a thread can be rescheduled to another CPU straight after it has obtained the number".

    ReplyDelete
  9. My point was regarding overheads associated with VDSO (and DSO in general) on large multi-processor machines (where topology is not a simple 1-hop network) and not so much the specific implementation of getcpu. vgetcpu has been in Linux since 2.6.24 and sched_getcpu is implemented in terms of vgetcpu. However even on smaller multi-processor systems, getcpu is simply is not an option since sched_getcpu may still be implemented in terms of a full system call (RHEL5 is still a standard).

    Subscribe/unsubscribe can provide a guarantee of cache line ownership everywhere while processor ID cannot do this in user-space if hard processor affinity is not used. You can collide very easily. In kernel-space, as you point out the ownership can be guaranteed. For memory-intensive user-space applications if the actual topology is an important factor then it is possible to set hard affinity (on Linux, see sched_setaffinity, simple interface example for RR distribution in http://codepad.org/IKLC2YkE) but this isn't suitable for applications that include a fair bit of thread deactivation and/or thread migration.

    The additional complexity is totally negligible considering the attractive guarantees it can provide (in an operating system agnostic manner) if subscribe/unsubscribe frequency is low.

    ReplyDelete
  10. Note, also, you can also use TLS to implement a similar technique but abstract away the ID from the interface and all these may be coupled with the nice guarantee on some operating systems with first-touch allocation policies for NUMA (providing essentially the same benefits as using getcpu, but again, in a portable manner).

    ReplyDelete
  11. @sbahra What you are saying makes perfect sense.
    However, engineering is always about choice and trade-offs. So I just want to show alternatives and describe their trade-offs.
    By the way, here is a draft of a new article "Distributed Reader-Writer Mutex" (it's not yet officially published, but almost done):
    http://www.1024cores.net/home/lock-free-algorithms/reader-writer-problem/distributed-reader-writer-mutex
    It shows how to build a very simple rw mutex, which still shows very good scalability.
    I've done the same with per-thread approach, and I can say that the implementation is *significantly* more complicated. The problem is with synchronization between arriving/departing threads and writers, and between departing threads and mutex destruction. And by the way, it's not that easy to catch thread completion event under Windows.

    ReplyDelete
  12. vgetcpu is a vsyscall on recent linuces, and its implementation is fast and uses RDTSCP or a per-cpu variable:

    http://www.cs.fsu.edu/~baker/devices/lxr/http/source/linux/arch/x86/vdso/vgetcpu.c

    ReplyDelete
  13. Thanks for the info! Nice to hear!
    Now I'm crossing my fingers for SYS_membarrier.

    ReplyDelete
  14. Hello Dmitry,

    Thank you for the ideas in the paper! I gave those three approaches (RDTSCP, CPUID, SIDT) a try in my microbenchmark (parallel_for over a team of 16 threads with 1 mill of tasks of (1) get_cpu_num(); (2) read a buffer of a predefined size from a memory pool and compute a check sum for the buffer (xor ints). Turns out reading from 0x7FFE0000 does not work on Windows 7. Using NtQuerySystemTime works just as good, but the return value is in 100 ns intervals, so simple != comparison in amortization function needs to be modified too.
    Performance of the different approaches were as you'd expect (or as you predicted), which is SIDT is the fastest, then amortized RDTSCP, and then CPUID is the slowest one. Curious thing however is that caching latest proc_num (and apic or idt alongside of it) in TLS makes the code slower really. In my case search among 16 values (registered apics or idts for all the logical CPUs I got) worked 5% faster than the cached version of the same code.

    ReplyDelete
  15. Another finding - if you use GetSystemTimeAsFileTime instead of NtQuerySystemTime on Windows 7, it does the following under the covers:
    mov edx,dword ptr [7FFE0018h]
    mov r8d,dword ptr [7FFE0014h]
    mov eax,dword ptr [7FFE001Ch]
    so, I guess "accessing the address" functionality is still there, only the address has changed a little bit.

    ReplyDelete
  16. Hi Anton,

    Thanks for the info!

    >Turns out reading from 0x7FFE0000 does not work on...

    Of course it is a dirty hack, and production code should include a dynamic check for known OS versions with a fall-back to fair GetTickCount() call.

    I guess that you are working on a relatively modern version of Windows, so it would be interesting to see numbers for GetCurrentProcessorNumber().

    >Curious thing however is that caching latest proc_num (and apic or idt alongside of it) in TLS makes the code slower...

    Hummm... That is *not* as I would expect... Do you implement TLS with __declspec(thread)? Do you prevent several accesses to TLS?

    ReplyDelete
  17. for beginners like me need a lot of reading and searching for information on various blogs. and articles that you share a very nice and inspires me .
    cara menggugurkan kandungan
    obat aborsi
    kalkulator kehamilan

    ReplyDelete
  18. Your initial step ought to be to discover the best rate of payday credit on offer. Whichever payday credit organization you may pick you should comprehend what will be the settled sum that you will owe after finishing your payday advance. Visit here Payday Loans Chicago

    ReplyDelete
  19. Java Assignment help
    All Assignment Help is a web portal where students get help in making assignments for all the subjects, with the help of our experts. You will get 100% plagiarism free assignment. Expert’s consultation is also available for students. If they have any query they can contact with our experts anytime.
    Law assignment help

    ReplyDelete
  20. you can go through AllAssignmentHelp reviews. It will help you to know about the services, and it will help you to know whether that site is reliable or not. This is a way of understanding the opinion of users, and it helps the website to maintain its functionality and reliability.

    ReplyDelete
  21. Get help from professional coursework writer in UK to ease your writing tasks if you are struggling hard. We are the only online coursework writing service that caters to your every need. We are the finest British online coursework help agency to help students for almost a decade now. Coursework Help & Writing Service from Trusted Coursework Helpers.We always keep ourselves a step further from rest

    ReplyDelete
  22. To write a great argumentative essays the Australian students search essay writing help, Even if you want to get HD grades meet professional Essay Writers to helped in your essay assessment. Our goal is to provide professional essay writing tips and provide services how to write an essay in a HG manners for Australian and international students. Get our help now to get your essays written by university assignment help experts!

    ReplyDelete
  23. Students are given assignments, such as an essay, on a regular or periodic basis by their professors. These are mandatory and students are graded on the basis of the quality of work done by them. Essay writing may be a great opportunity to score the top grades but they come along with challenges as well. To overcome them, the best way is to consult an online academic helper who provides essay help. TutorVersal has a dedicated team of qualified essay and homework help experts who assist students with their assignment tasks. This not allows students to submit their work on time but also lets them score excellent grades in them!

    ReplyDelete
  24. Online Assignment Help provided by the experts in the information management in Australia. For students of various Australian universities, we make available the expert assignment writing help in IT management, ICT management and mass communication.

    Marketing Assignment Help gained recognition as a business process in the early 20th century. Since then, it has remained vital and central to organizational processes. Being central to both managerial as well as social process of an organization.

    ReplyDelete
  25. Thank you for a resourceful blog. Students often require an online assignment help provider who can assist them in completing lengthy and complex assessment tasks on time. There are a lot of academic services available on the internet but only a few provide them with what they want. TutorVersal has a team of highly qualified MYOB assignment help professionals who provide students with academic assistance. Our range of services include one-to-one subject consultation, plagiarism-checking, and fully-customized assignments right at the student’s doorstep. In the past five years, we have catered to thousands of students pursuing their education from the universities across Australia. The ProModel simulation assignment help experts are sure that with the help of our experienced academic writers, you will be able to overcome all your challenges in writing university assignments.

    ReplyDelete
  26. I have gone through your content and I found it more relevant to what I was looking for! I really appreciate the way you have written this content including various information. However, this is to inform you about My Assignment Help Oz who provides the nonstop AutoCAD assignment help services to students who are studying at university. It offers round the clock AutoCAD assignment services with the help of subject matter experts who have at least five years of experience in the field of academic help. I am an AutoCAD assignment writer working at My Assignment Help Oz. At this company, we are bound to deliver the best AutoCAD assignment help in Australia and the rest of the countries. If you are also a student of engineering and willing to get in touch with engineering assignment help experts, just feel free to reach us immediately.

    ReplyDelete
  27. All the students who are looking for a case study assignment help, please pay attention. My Assignment Services is a destination that thousands of students turn to when they are in need of an expert consultation. The team of experts delivering assignment help to the students are either master’s degree holders of have PhD doctorate. Therefore, every assignment that we have ever written is of supreme quality and international standard for global competition. Students who are searching for companies with “who can do my assignment for me”, My Assignment Services is your destination. We have an expertise when it comes to writing original and unique answers and deliver the work under strict deadline on time. The case study assignment help with us is a service prepared keeping in mind the needs of the students. An affordable assignment help from best experts is now available!

    ReplyDelete
  28. Many of the people are depressed about the problems of essay writing. Well, don’t worry about that because we are providing this service at a very reasonable price.
    Do My Homework
    Do My Assignment

    ReplyDelete
  29. I found this one pretty fascinating and it should go into my collection. Very good work! I am Impressed. We appreciate that please keep going to write more content. We are the assignment helper, we provide services all over the globe. We are best in these:- services
    Assignment Help
    Assignment Helper
    Essay writing
    Essay writing service
    Dissertation help
    Thesis writing help
    Write My Essay
    Computer Science Assignment Help
    Assignment Help South Africa
    Assignment Writing Service

    ReplyDelete
  30. I loved the article, keep updating interesting articles. I will be a regular reader I am offering assignment help to students over the globe at a low price.
    Essay writing
    Essay Writer
    seo writing service
    Essay writing service
    Essay writing help
    Write My Essay
    hire seo writer
    hire writer
    Write my essay cheap
    hire article writer

    ReplyDelete
  31. ABC Assignment Help is an incomparable online Accounting assignment help company delivering excellent academic assignments, essays, coursework and reports. Through a team of over 3000 subject experts we ensure individual attention to every student making the assignment help experience completely personalized in nature. With our round the clock services, you can be assured of high grades every time.

    ReplyDelete
  32. I found this one pretty fascinating and it should go into my collection. Very good work! I am Impressed. We appreciate that please keep writing more content. We are the assignment helper, we provide services all over the globe. We are best in these:- services

    Ecommerce CMS
    Web Development Tools
    Social bookmarking sites
    Benefits Of Social Media Marketing

    ReplyDelete
  33. I found this one pretty fascinating and it should go into my collection. Very good work! I am Impressed. We appreciate that please keep writing more content. We are the assignment helper, we provide services all over the globe. We are best in these:- services
    Best Assignment Help
    What is the Importance of Statistics?
    What is the best way to learn java?
    What is the best way to learn Python?
    What is computer science?
    How to write an essay plan?

    ReplyDelete
  34. ABC Assignment Help is an incomparable online Finance assignment help company delivering excellent academic assignments, essays, coursework and reports. Through a team of over 3000 subject experts we ensure individual attention to every student making the assignment help experience completely personalized in nature. With our round the clock services, you can be assured of high grades every time.

    ReplyDelete
  35. If you love playing card games, Playrummy.com provide you rummy online here! Join us now and play rummy games to win real cash.Play Online Rummy Card Games on India’s Largest Rummy Website. 100% Safe & Sure - Download the rummy app now!

    ReplyDelete