Comments (8)

SeKwonLee avatar SeKwonLee commented on May 28, 2024

Hi @mkatsa,

Thank you for your interest in our work. Could you provide more details about your experiments and results so that I can better understand them? A few papers that study the performance and scalability of PM indexes (attached below; they also cite our paper) may give you some hints about your results.

  1. https://par.nsf.gov/servlets/purl/10222958
  2. https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9656149

from recipe.

mkatsa avatar mkatsa commented on May 28, 2024

Thank you for your feedback on this.

We have a dual-socket server with an Intel Xeon Gold 5218R processor with 80 cores, 2x256GB Intel Optane DIMMs configured in App Direct mode, and 4x32GB DRAM DIMMs. We configure and mount the file system exactly as you describe in this repository, and we are trying to replicate the results you report for the alternative indexes by following exactly the same process. In DRAM execution, the alternative indexes seem to scale as we increase the number of threads. However, when we use libvmmalloc to map all dynamic allocations to Optane, we do not see similar performance. For instance, with 16 threads you achieve up to ~9.5 MOps/sec for P-Masstree, while we achieve up to ~1.5 MOps/sec for the YCSB workloads. We observe this for all the alternative indexes. Are we missing something in the experimental setup? Is there perhaps an Optane configuration that could help improve throughput?

Thanks
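For readers who want to reproduce the Optane runs described above, here is a minimal sketch of how a benchmark process can be launched with libvmmalloc preloaded. `LD_PRELOAD`, `VMMALLOC_POOL_DIR`, and `VMMALLOC_POOL_SIZE` are the environment variables libvmmalloc reads. Everything else is an assumption for illustration: the helper names, the library path, the `/mnt/pmem` mount point, and the pool size are hypothetical, and the benchmark command is left to the caller.

```python
import os
import subprocess


def vmmalloc_env(pool_dir, pool_size_bytes,
                 lib_path="libvmmalloc.so.1", base_env=None):
    """Return a copy of the environment that preloads libvmmalloc.

    pool_dir        -- directory on the PM-backed file system (hypothetical path)
    pool_size_bytes -- size of the memory pool file, in bytes
    lib_path        -- where libvmmalloc lives on this machine (assumption)
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LD_PRELOAD"] = lib_path
    env["VMMALLOC_POOL_DIR"] = pool_dir
    env["VMMALLOC_POOL_SIZE"] = str(pool_size_bytes)
    return env


def run_on_pm(command, pool_dir, pool_size_bytes, **kwargs):
    """Run `command` (a list of strings) with its heap redirected to PM."""
    env = vmmalloc_env(pool_dir, pool_size_bytes, **kwargs)
    return subprocess.run(command, env=env, check=True)
```

A call would look like `run_on_pm(benchmark_command, "/mnt/pmem", 64 * 1024**3)`, where `benchmark_command` is the benchmark invocation from the repository's instructions. Running the same command without the modified environment gives the DRAM baseline.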

SeKwonLee avatar SeKwonLee commented on May 28, 2024

I think you should check whether PM bandwidth is the bottleneck in your setup. We used 6x128GB (768GB) Intel Optane DIMMs to measure performance in our paper, but you are using only two 256GB Optane DIMMs.

mkatsa avatar mkatsa commented on May 28, 2024

I understand that this may be a significant factor. However, for workloads of around 64GB, as in YCSB, doesn't the memory controller map the application's data to a single DIMM (since it fits there) in order to avoid NUMA effects such as remote accesses?

SeKwonLee avatar SeKwonLee commented on May 28, 2024

I am not sure whether the memory controller can confine the application to a single DIMM, even if the workload fits in one, as long as address space interleaving is enabled. Furthermore, the PM bandwidth issue that I pointed out is different from NUMA effects. If you use a small number of PM DIMMs (1 or 2), that alone could easily throttle the performance of the indexes. For more details about this issue, please refer to Section 5.10 in the paper that I attach below.
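As a rough sanity check on the numbers quoted in this thread, the sketch below compares the reported throughput gap with the ratio of DIMM counts. It uses only figures from the comments above (9.5 and 1.5 MOps/sec, 6 and 2 DIMMs). The assumption that aggregate PM bandwidth scales linearly with the number of interleaved DIMMs is an idealization introduced here, not a measured result, and the variable names are illustrative.

```python
# Figures quoted in this thread.
reported_mops = 9.5   # P-Masstree, 16 threads, as reported in the paper
observed_mops = 1.5   # same workload on the 2-DIMM machine
paper_dimms = 6       # 6x128GB Optane DIMMs
local_dimms = 2       # 2x256GB Optane DIMMs

# How much slower the local run is than the reported one.
throughput_gap = reported_mops / observed_mops

# Idealized assumption: bandwidth grows linearly with interleaved DIMM count,
# so the DIMM ratio is the slowdown that DIMM count alone would explain.
dimm_ratio = paper_dimms / local_dimms

print(f"throughput gap:   {throughput_gap:.2f}x")
print(f"DIMM-count ratio: {dimm_ratio:.2f}x")
```

Under this idealized model the DIMM count would account for part of the gap but not all of it. The sketch cannot say where the remainder comes from, so measuring the raw PM bandwidth of the machine directly is the more reliable next step.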

mkatsa avatar mkatsa commented on May 28, 2024

Thank you for your feedback. I suppose that your system has 3 Optane DIMMs per socket. So, of the 6 DIMMs in your 2-socket system, did your single-socket experiments use only the 3 DIMMs attached to one socket? If not, have you found a way to use all the DIMMs as a single namespace?

SeKwonLee avatar SeKwonLee commented on May 28, 2024

Thank you for your follow-up questions. At this point I honestly do not know how the DIMMs in our experimental setup were utilized while we ran our experiments. Regardless, 2 DIMMs sounds like too few to evaluate maximum performance. Could you run your experiments on another PM machine that has more DIMMs per socket, if one is available?

SeKwonLee avatar SeKwonLee commented on May 28, 2024

Closing this issue now. Please reopen if you are still facing the same issue.
