March 10, 2011

Hybrid disk simulation using DiskSim

DiskSim is a well-known storage simulator developed at CMU. It simulates disks or, with an extension from Microsoft, SSD devices. Unfortunately, it does not support combining the two into a hybrid disk such as Seagate's Momentus. The following paragraphs show how to implement a hybrid disk.

For convenience, you can also download the hybrid DiskSim package, which includes all changes, here.

First, download DiskSim and the SSD extension to a 32-bit Linux machine. Extract DiskSim, then extract the ssdmodel folder from the extension into the disksim-4.0 folder. Follow the directions in the README file in the ssdmodel folder. Run make in the disksim-4.0 folder, then run the validation scripts of both DiskSim and the SSD extension.

Download this file (download), which contains the diff between the original DiskSim version and the hybrid DiskSim. Apply the changes and recompile the entire package.

Now you should be able to instantiate the following layout of components:
topology disksim_iodriver driver0 [
   disksim_bus bus_dr2c [
      disksim_ctlr ctlr0 [ 
         disksim_bus bus_c2di [
            disksim_disk disk0 []
         ],
         disksim_bus bus_c2f [
             ssdmodel_ssd ssd0 []
         ]
      ]
   ]
]

Here the controller has two separate buses, one carrying accesses to the disk and the other carrying accesses to the SSD. A complete parameter file can be found here.
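
To exercise both buses, a trace needs requests for both devices. The following is a minimal sketch of generating such a trace in Python. It assumes the ASCII column order used in the comments on this post (arrival time, device number, block number, request size, flags) and the device numbering described there (0 for the disk, 1 for the SSD). The helper names and the request values are made up for illustration, and the flag values are passed through without interpretation.

```python
# Device ids as described in the comments on this post: 0 = disk, 1 = SSD.
HDD_DEV, SSD_DEV = 0, 1

def trace_line(arrival_time, device, blkno, size, flags):
    """Format one request as: arrival time, device, block number, size, flags."""
    return f"{arrival_time:.6f} {device} {blkno} {size} {flags}"

def build_trace(requests):
    """Join requests (already sorted by arrival time) into trace text."""
    return "".join(trace_line(*r) + "\n" for r in requests)

# Illustrative values only. The SSD requests start at a multiple of 8 blocks
# and span a multiple of 8 blocks; flag values are passed through unchanged.
example_trace = build_trace([
    (13.191135, HDD_DEV, 0, 5, 1),
    (17.829285, SSD_DEV, 24, 8, 1),
    (25.911729, HDD_DEV, 2, 12, 0),
    (29.792530, SSD_DEV, 8, 16, 1),
])
```

Writing `example_trace` to a file and passing that file as the trace argument of disksim should then send the first and third requests to the disk and the other two to the SSD.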

Now you're ready to use it.

65 comments:

  1. Hello Anjo
    Thank you for posting this solution online. I ran into a problem when running make in the disksim-4.0 folder after extracting the hybrid version of DiskSim.

    make[1]: Entering directory `/home/HybridDiskSim/diskmodel'
    make[1]: *** No rule to make target `/usr/include/bits/predefs.h', needed by `mech_g1_seektime.o'. Stop.
    make[1]: Leaving directory `/home/HybridDiskSim/diskmodel'
    make: *** [all] Error 2
    [cira@op00 HybridDiskSim]

    Do you have any advice?
    Thanks
    Cristian Cira

  2. Hi Cristian
    I just have vague conjectures. First let me ask a few questions...

    Does this error also occur if you try to make the usual disksim? What system do you run?

    It is surprising that your Linux does not have this header file. Please make sure that you have installed libc6-dev for your respective hardware (i386 or AMD64). Check whether the folder /usr/include/bits exists.

    Hope this helps. Please keep me posted, if you find the error.

    Thanks,
    Anjo

  3. Anjo,

    thank you for your reply. I got back to DiskSim this Monday and solved my problem. I'm using CentOS, and on my machine the two files stddef.h and stdarg.h are located at

    /usr/lib/gcc/i386-redhat-linux/4.1.0/include/stddef.h
    /usr/lib/gcc/i386-redhat-linux/4.1.0/include/stdarg.h

    Make was looking for them in /usr/lib/gcc/i486-linux-gnu/4.1.1/include/stdarg.h

    Changing those paths in the *.d files solved the problem, and I was able to build your hybrid version of DiskSim and run your Maxtor + SSD model.

    Thank you again for posting this,
    Cristi

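
The path fix described in the comment above can be scripted rather than done by hand. This is a minimal sketch, assuming the stale compiler include path appears literally in the generated *.d dependency files; the function name is made up, and the two prefixes in the usage note are the ones quoted in the comment and will differ on other systems.

```python
from pathlib import Path

def rewrite_dep_paths(root, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix in every *.d file below root.

    Returns the list of files that were changed.
    """
    changed = []
    for dep_file in sorted(Path(root).rglob("*.d")):
        text = dep_file.read_text()
        if old_prefix in text:
            dep_file.write_text(text.replace(old_prefix, new_prefix))
            changed.append(dep_file)
    return changed
```

Called from the top of the source tree as `rewrite_dep_paths(".", "/usr/lib/gcc/i486-linux-gnu/4.1.1/include", "/usr/lib/gcc/i386-redhat-linux/4.1.0/include")`, it would apply the same substitution to all dependency files at once. Keeping a backup of the tree beforehand is advisable, since the files are rewritten in place.
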
  4. Anjo,

    I keep bombarding you with questions :) Sorry.

    Your link "A complete parameter file can be found here." points to a parameter file that contains only one HDD and no SSD. Here is the topology from that file:

    # system topology
    topology disksim_iodriver driver0 [
    disksim_bus bus_dr2c [
    disksim_ctlr ctlr0 [
    disksim_bus bus_c2di [
    disksim_disk disk0 []
    ] # end of bus_c2di
    ] # end of ctlr0
    ] # end of bus_dr2c
    ] # end of system topology

    Have you managed to simulate a hybrid HDD+SSD system (ideally with two buses and one controller)?

    Thank you,

    Cristian Cira
    PhD student
    PASL CSSE
    Auburn University

  5. Hi,

    Thanks for your hint!

    Please excuse the mistake. I changed the example.parv to one including the topology described in the post.

    We have already run experiments successfully with the hybrid DiskSim.

    Anjo

  6. Anjo,

    I am yongseok and interested in your work, hybriddisksim.

    I have a question. What are ReplicationFactor, IndexLayout, PolicyLayout, etc. in example.parv?

    I could not find these keywords in your source code.

    Thank you,

    Yongseok

  7. Yongseok,

    Thank you very much for your interest.

    These parameters are not used in the hybrid implementation. I changed the .parv file, so it should be fine now.

    If you have any further question, do not hesitate to contact me.

    Anjo

  8. Anjo,

    I am still trying to run your example.parv on the HybridDiskSim that I downloaded, successfully compiled, ran and tested (on the 'valid' example provided by the simulator and Microsoft).

    Every time I run the simulator with a hybrid system (not only yours, I tried some by myself) I get this error

    [cira@op00 HybridDiskSim]$ ./src/disksim /mytest/anjo.parv stdout ascii 0 1

    *** Output file name: stdout
    *** Input trace format: ascii
    *** I/O trace used: 0
    *** Synthgen to be used?: 1

    *** assertion failed: in disksim_loadparams() (disksim_loadparams.c:93): disksim->parfile != NULL: /mytest/anjo.parv
    Aborted

    or

    *** assertion failed: in DISKSIM_GLOBAL_STAT_DEFINITION_FILE_loader() (disksim_global_param.c:114): 0: Couldn't find statdefs file in path
    Aborted

    or a good old classic "Segmentation fault"

    Do you have any idea what I am doing wrong? Hopefully it is something obvious and simple that I am missing :)

    Thank you for your time

    This assertion fails

  9. Extra remark

    The segmentation fault error is preceded by "SSDCache not valid in context disksim_ctlr "

    [cira@op00 HybridDiskSim]$ ./src/disksim mytest/anjo.parv stdout ascii 0 1

    *** Output file name: stdout
    *** Input trace format: ascii
    *** I/O trace used: 0
    *** Synthgen to be used?: 1

    *** warning: parameter SSDCache not valid in context disksim_ctlr
    Segmentation fault
    [cira@op00 HybridDiskSim]$

    Thank you

  10. Hi Cristi,

    Thank you for your interest!

    Let me consider each crash separately:
    1) *** assertion failed: in disksim_loadparams() (disksim_loadparams.c:93): disksim->parfile != NULL: /mytest/anjo.parv

    Here the problem is that it cannot open the parameter file for reading. Please check the location of the file again. Maybe it is ./mytest/anjo.parv instead of /mytest/anjo.parv, which would be located at the root and not in your home directory?

    2) *** assertion failed: in DISKSIM_GLOBAL_STAT_DEFINITION_FILE_loader() (disksim_global_param.c:114): 0: Couldn't find statdefs file in path

    A file called statdefs has to be in the folder of the diskmodel. You can download one from the diskmodels folder (http://www.mpi-sws.org/~vahldiek/diskmodels/statdefs).

    3) Segfault
    I found an error in the example.parv. It included a separate cache for the flash, which does not match the way disksim_ctlrsmart is implemented. (I took this example.parv from my extended implementation and thought I had removed everything, but this SSDCache was still there.) The current version online should work now.

    Thank you very much again, your usage helps to make the hybriddisksim more robust.

    Anjo

  11. Anjo,

    Thank you very much again for your help! The hybrid example works fine now.

    The first error was a mistake of mine. Sorry for that.

    I had the statdefs file in the diskmodel/ folder but it looks like the simulator was looking for it in mytest/ (the folder I created to store my parv files and where anjo.parv was also stored). Copying the file in mytest/ solved the second issue.

    Thank you for the third solution!! I am new to DiskSim and still getting familiar with it. Finding that design error would have been very time-consuming for me right now.

    I'll keep posting remarks and issues as I run into them so others have a reference.

    Thanks
    Have a great day,
    Cristi

  12. hi, Anjo,
    I followed your comments and compiled your HybridDiskSim successfully. However, when I run ./src/disksim example.parv stdout ascii 0 1, something seems to go wrong during initialization. The output was as follows:

    Max per-disk pending count = 1
    } # end of CTLR0 spec

    *** assertion failed: in lp_instantiate() (util.c:589): spec != 0: no such type Proc.

    Aborted

    How to fix it?

    Thank you!

  13. The hybrid-simulator and the parameter file work just fine. You are missing the .diskspecs and .model files for the maxtor146g disk that is used in the .parv file.

    Anjo,

    For the SSD+HDD hybrid architecture, how can we configure the address space (the LBN sequence) for them? For example, if the SSD's capacity is 10 GB (20M blocks) and the HDD's is 20 GB (40M blocks), I want trace entries with blkno smaller than 20M to access the SSD and trace entries with blkno between 20M and 60M to access the HDD. How can I implement this?

    Thank you,
    Cristi

  14. Hi,

    the parameter is called "Storage capacity per device" in the disksim_synthgen section.

    In general you would not have a contiguous LBN space; both devices would start from 0, but the device would be different.

    I would suggest the following: have two disksim_synthgen blocks, one with capacity=20GB and device hdd and the other with capacity=10GB and device ssd.

    If you want to do it exactly as you described, you need to change the implementation, e.g. in disksim_ctlrsmart.c, where the decision on where to send the event is made. There you could easily make the separation in ctlr_request_arrive and set the device and busno accordingly.

    Hope this helps...
    Anjo

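
For trace-driven runs, the split asked about above could also be made before the simulation, by rewriting the trace instead of changing ctlr_request_arrive. The following is a minimal sketch under that assumption. The sizes are the ones from the question (20M blocks of SSD followed by 40M blocks of HDD), the device ids follow the convention described elsewhere in this thread (0 for the disk, 1 for the SSD), and the function names are made up. Requests that straddle the boundary between the two devices are not handled.

```python
SSD_BLOCKS = 20_000_000   # SSD size in blocks, from the question above
HDD_BLOCKS = 40_000_000   # HDD size in blocks, from the question above
HDD_DEV, SSD_DEV = 0, 1

def split_lbn(global_blkno):
    """Map a block number in one flat address space to (device, local blkno)."""
    if global_blkno < 0 or global_blkno >= SSD_BLOCKS + HDD_BLOCKS:
        raise ValueError(f"block {global_blkno} is outside the address space")
    if global_blkno < SSD_BLOCKS:
        return SSD_DEV, global_blkno
    return HDD_DEV, global_blkno - SSD_BLOCKS

def remap_line(line):
    """Rewrite the device and block number of one ASCII trace line."""
    time, _dev, blkno, size, flags = line.split()
    dev, local = split_lbn(int(blkno))
    return f"{time} {dev} {local} {size} {flags}"
```

With these numbers, `remap_line("1.000000 0 20000008 8 1")` yields a request for block 8 of the disk, while any block number below 20M is sent unchanged to the SSD.
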
  15. Anjo,

    thanks again for your help. I managed to describe and test my hybrid system. Setting the stripe size to the size of the HDD allows me to use as many HDDs + one SSD (at the "end") and reference any of them from the trace file.

    Do you know of any documentation for the SSD module that explains the parameters of the SSD used in DiskSim in some detail?

    Have a great weekend,
    Cristi

  16. Hi Cristi,

    I'm glad to hear of your success!

    Unfortunately I do not know any documentation of the SSD module besides the USENIX'08 paper (http://research.microsoft.com/pubs/63596/usenix-08-ssd.pdf).

    Hope this helps.
    Anjo

  17. Hi,

    I am Andrew and I'm new to DiskSim, but your blog really helped me a lot. I have now run some RAID 5 simulations and want to draw graphs from the output files, but I don't know how. I tried gnuplot, but it only works for files with data in columns, and in my .out files the distributions are in rows.

    Can you help me ? Thanks in advance
    Have a nice day

  18. Hi Andrew,

    I wrote my own small analysis/draw tool using gnuplot and javagnuplot (java library interfacing gnuplot). Javagnuplot is very easy to use, you might want to have a look - similar libraries exist for perl and python...

    I think about publishing it in the next few days when it is complete (histograms (for average values) + distribution functions).

    Anjo

  19. Hi Anjo

    I just wanna ask if the Hybrid Disksim will take a trace file for non-hybrid disksim ?

    I mean that the SSD extension only take a block number that is multiple of 8 !!!

    What about Hybrid Disksim ?

    Can we get around this problem ?

    Thaaanks

  20. Hi,
    You can use traces which include disk-only, flash-only or hybrid accesses (addressable by device 0 for disk, 1 for flash). They have to comply with the normal disksim usage so everything has to be aligned by 8.

    What is the problem with aligning them to 8?

    Anjo

  21. Hi Anjo

    I guess there is misunderstanding on my part about the alignment by 8.

    When using a real trace such as websearch, the LBN and request size are not divisible by 8. Maybe I do not understand what you really mean by aligning them by 8?

    Do you mean rounding the LBN and request size?
    If the answer is yes, doesn't this hurt the accuracy of the original trace?

    I am trying to build a simulator that will take an original trace file generated by an HDD Only and feed it to the Hybrid SSD&HDD storage system in order to prove that a hybrid storage will run the trace faster ...

    example from original Trace File
    113.474738 0 12 3 0
    time stamp, dev no, LBN, req size, 0 for READ

    Thanks

  22. Hi,

    Just recalled a paragraph of the original ssd disksim paper(http://research.microsoft.com/pubs/63596/usenix-08-ssd.pdf) at the end of section 4.2 Workloads:

    All requests in this workload are for multiples of 8KB blocks. Alignment is important, since misaligned requests to flash add a page access to every read or write. Several of the logical volumes in our configuration were misaligned, yielding traces in which LBA mod 8 = 7 for all LBA. We corrected for this by post-processing the roughly 6.8M events in this trace.

    I assume that the system you describe has some controller which decides where to read the data from. It should generate only 8-aligned accesses to the SSD. Otherwise you'll have to write your own SSD simulator and plug it into DiskSim...

    Hope this helps,
    Anjo

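
One way to obtain 8-aligned SSD accesses from an arbitrary request is to widen it to the enclosing page boundaries. The following is a minimal sketch of that rounding, assuming 8 blocks per page as discussed above; the function name is made up. Note that this reads or writes more blocks than the original request, which is exactly the accuracy trade-off raised in the question.

```python
PAGE_BLOCKS = 8

def align_request(blkno, size, page_blocks=PAGE_BLOCKS):
    """Expand [blkno, blkno + size) to the enclosing page-aligned range."""
    if size <= 0:
        raise ValueError("size must be positive")
    start = (blkno // page_blocks) * page_blocks            # round start down
    end = -(-(blkno + size) // page_blocks) * page_blocks   # round end up
    return start, end - start
```

For the request from the example trace line above (LBN 12, size 3), `align_request(12, 3)` gives a start block of 8 and a size of 8. A request that crosses a page boundary, such as LBN 7 with size 8, grows to two pages.
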
  23. Thank Anjo

    it really helped .. the req size issue is understood ..

    I have a further question, if you will. Would it be possible to create a mapping function that maps an LBA to a (block, page) pair in the SSD, in order to place each page from the disk trace at its exact location in the SSD?

    Does this sound like the right way to implement it?

    thanks again

  24. Hi,
    I'm not sure if I understand your question...

    How you implement the mapping depends on what you would like to achieve. Usually the SSD space is smaller than the disk space, so you cannot have the direct mapping you were describing.
    If you would like to use the SSD as a cache, then you would have a table which specifies whether or not something is cached and where it is cached on the SSD...

    Anjo

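
Such a table can be sketched in a few lines. The following is an illustration only, not part of the hybrid DiskSim: it assumes page-granular caching and picks least-recently-used eviction as one possible policy. The class and method names are made up.

```python
from collections import OrderedDict

class SsdCacheMap:
    """Tracks which disk pages are cached and in which SSD page slot."""

    def __init__(self, ssd_pages):
        if ssd_pages <= 0:
            raise ValueError("the SSD needs at least one page")
        self.free = list(range(ssd_pages))   # unused SSD page slots
        self.table = OrderedDict()           # disk page -> SSD page, LRU first

    def lookup(self, disk_page):
        """Return the SSD page holding disk_page, or None on a miss."""
        if disk_page not in self.table:
            return None
        self.table.move_to_end(disk_page)    # mark as most recently used
        return self.table[disk_page]

    def insert(self, disk_page):
        """Cache disk_page, evicting the least recently used entry if full."""
        hit = self.lookup(disk_page)
        if hit is not None:
            return hit
        if self.free:
            slot = self.free.pop(0)
        else:
            _victim, slot = self.table.popitem(last=False)
        self.table[disk_page] = slot
        return slot
```

A controller built around this would call `lookup` for every request, send hits to the SSD at the returned page, and send misses to the disk, optionally calling `insert` afterwards. A real write-back design would also have to copy dirty victims back to the disk before reusing their slot, which is left out here.
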
  25. Hi anjo

    the last six words of your previous post are exactly what I mean: "where it is cached on SSD".

    Does an SSD LBA have a FIXED location in the SSD (element, plane, block, page offset)?

    In other words, if I issue an I/O request such as

    123.456789 0 50 0 .. does that mean this request will be reading a page from Element 0, Plane 0, Block 0, Page 49 ??? OR Not ?

    is this something I can control in my simulator or it is an FTL responsibility to decide where to read this page from ?

    thanks

  26. Hi,

    My understanding is that the FTL decides on the actual location of a block. What is important for you is that a read of blkno 50 subsequent to a write of blkno 50 will return the previously written data.

    In your case you might want to consider adding an abstraction layer in between, or rewriting the SSD module.

    For your purpose it may make sense to look at the CacheDev files of DiskSim. They help you to use entire devices (like an SSD or HDD) for caching.

    Anjo

  27. Hi Anjo
    First, I want to thank you for your quick answer. I managed to draw the graphs in the end with a Visual Basic app (not very complex, but good). Now I would like to study the failure of an HDD in RAID 5 and the subsequent rebuild. As far as I can see, DiskSim lacks this part.

    Do you know if this can be accomplished with Disksim, or can you guide me to another simulator (i read about a very powerful tool - DASim but it's not to be found)? What are you using for this operation?

    thanks again

  28. Hi Anjo

    I've run your hybrid DiskSim successfully on a trace.

    What I find puzzling is that I got the SAME average IOdriver response time from both your hybrid architecture and the unchanged DiskSim.

    I was assuming that your hybrid would reduce the average response time because of the SSD cache?

    Thanks

  29. Hi,

    Thanks for your interest.

    The SSD does not implement caching behavior. If you want it to act as a disk block cache, you will have to change disksim_ctlrsmart.c to get the behavior you want.

    This simulator just gives you the possibility to run an SSD and a disk in the simulator at the same time.

    Anjo

  30. Thanks Anjo

    I guess the point I am trying to get at is: if I use the parv file above (as is) along with a trace file, what will be the default behavior of the controller?

  31. The controller basically forwards all requests of device 0 to the disk and all requests for device 1 to the SSD. Both have different caches and can be configured independently.

  32. Hi Anjo
    I've compiled your hybrid DiskSim implementation successfully, but when I run "disksim example.parv k1 ascii 0 1" (without any change to the example.parv file), the SSD's access time and the number of requests that arrived at the SSD are zero. In other words, all of the random requests are sent to the HDD!
    Do you have any idea about this?
    Thanks
    and Happy new year!

  33. Hi Kaveh
    If you use the synthetic workload generation module (defined in the example.parv), you usually send the requests to a single device. The example.parv defines at the end that this is the HDD. The current implementation does not have a shared LBN space (the space is divided by using different device ids (0=HDD, 1=SSD)).

    If you want to send requests to the SSD you have to change the device id and restrict the max lbn to the SSD size.

    What do you plan to implement or what behavior would you expect?

    Happy new year!!!
    Anjo

  34. Hi Anjo
    I'm Peter, a graduate student.
    I have some questions about the hybrid disksim.
    I want to apply hybrid disksim to run the trace file that my program generated. Can I assign what requests to Disk and what to SSD?

    Thank you!

  35. Hi Peter,
    Yes, use the device identifier. If set up as in the example file, the disk is device 0 and the SSD is device 1. I think the device id is the second parameter in the ASCII format.
    Hope this helps, let me know if you have any further questions...

    Anjo

  36. Hi Anjo
    I'm Peter. Thanks for your help.
    But I want to confirm something from your reply. Because I'm not sure what I think is correct.
    Does the "device identifier" mean that I can use "disksim_iomap" to assign the device? I ask because I don't see "disksim_iomap" in the "example.parv" file that you provided.

    Thank you very much!

  37. Hi,
    I haven't worked with iomaps so far in this context. Did you see any issues while using one?
    Anjo

  38. Hi, Anjo
    I saw the iomaps in the "ascii.parv" file that is in the DiskSim 4.0 package. But they seem to work with multiple disks, not with one disk and an SSD.

    Excuse me! Would you explain what the device identifier is and how to assign an ID to a device? I am confused...

    Thank you!

  39. Hi,
    Just from the way it is described in the DiskSim documentation and without trying anything, I think instead of diskX you should be able to use ssdX.

    When instantiating your components as follows, you give them names:
    # component instantiation
    instantiate [ statfoo ] as Stats
    instantiate [ bus0 .. bus2 ] as BUS0
    instantiate [ bus3 .. bus20 ] as BUS1
    instantiate [ disk0 .. disk17 ] as HP_C2249A
    instantiate [ ctlr0 .. ctlr19 ] as CTLR0
    instantiate [ driver0 ] as DRIVER0
    instantiate [ ssd0 .. ssd20 ] as SSD

    Then I do not see a reason why you shouldn't be able to use the ssdX identifier for disksim_iomaps. At least I have used the ssdX identifier also for synthetic trace generation.

    BTW, the iomapping is used to adapt a trace to a certain scenario without transforming the trace beforehand; it is done at run time. You should be able to use a disk and an SSD without using iomap.

    Anjo

  40. Dear Anjo,

    Sorry for taking up some of your time. I have tried to use the hybrid DiskSim simulator but ran into some problems. Would you please help me find where the “bug” is? Thank you.

    Following describes my experimental steps.
    I created a simple trace file "example.trace" as follow:
    arr. time devNo. BlockNo. req size flag
    13.191135 0 0 5 1
    17.829285 1 24 6 1
    24.314779 1 4 2 1
    25.911729 0 2 12 0
    29.792530 1 8 15 1
    38.031193 1 12 32 1
    Finally, I run with the following command: "./disksim example.parv example.outv ascii example.trace 0"
    Unfortunately, the hybrid disksim printed the following error message.

    Assertion failed:
    simtime = 24.314779
    totalreqs = 3
    disksim: ssd.c:645: ssd_media_access_request_element: Assertion `tmp->bcount == currdisk->params.page_size' failed.
    Aborted

    Is the problem that the block number should be aligned by 8?
    And the request size?

    Thank you for your kind help.

    Peter

  41. Yes, the SSD module from Microsoft requires alignment by 8.

    Another error, which would occur with your trace once you fix the first, concerns the request size. If I remember correctly, the request size has to be in whole pages, which usually means accessing 4KB (or 8 blocks) at a time. This restriction requires you to access in multiples of 8 blocks.

    Note: These restrictions apply only to SSD requests. Requests going to the rotational disk module can be of any kind.

    Anjo

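
The two rules above can be checked on a trace before it is handed to the simulator. The following is a minimal sketch that applies them to the trace from the question, assuming the SSD is device 1 and a page is 8 blocks; the function name is made up. It only reports requests that break the rules as stated in this thread and makes no claim about which of them triggers the assertion first.

```python
PAGE_BLOCKS = 8
SSD_DEV = 1

def misaligned_ssd_requests(trace_text):
    """Return (line_number, line) for SSD requests that break the 8-block rule."""
    bad = []
    for lineno, line in enumerate(trace_text.splitlines(), start=1):
        fields = line.split()
        if len(fields) != 5:
            continue                      # skip blank lines and header text
        dev, blkno, size = int(fields[1]), int(fields[2]), int(fields[3])
        if dev == SSD_DEV and (blkno % PAGE_BLOCKS or size % PAGE_BLOCKS):
            bad.append((lineno, line))
    return bad

# The example.trace contents quoted in the question above.
peter_trace = """\
13.191135 0 0 5 1
17.829285 1 24 6 1
24.314779 1 4 2 1
25.911729 0 2 12 0
29.792530 1 8 15 1
38.031193 1 12 32 1
"""

violations = misaligned_ssd_requests(peter_trace)
```

For this trace, all four SSD requests (lines 2, 3, 5 and 6) are reported, either because the block number or because the size is not a multiple of 8. The two disk requests are left alone.
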
  42. Dear Anjo,
    I still have some problems.
    When I run the trace file with 10 requests, the last request was not handled.

    I got the information by checking the output file.

    Overall I/O System Total Request Handled: 9

    Thank you

    Peter

  43. Dear Peter,

    I can't help with this problem. The issue you mention also arises with regular DiskSim. For some reason, when you use a trace file as input, a few of the last requests are not handled.

    If you look at how disksim finishes the execution, you will see that a flag is set to stop it. For some reason in a trace-based simulation this flag is raised a little bit too early...

    Since missing the last request happens deterministically, it will be the case for all simulations. If you have 100ks of requests in a workload, missing the last one should not matter too much. If you really want to have the last request, add a fake one right behind it... Then the fake one will be missed.

    Another option is to look at the disksim mailing list, I think some time back it was discussed there too.

    Anjo

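
The workaround of appending a fake request can be automated. The following is a minimal sketch, assuming the five-column ASCII trace format used elsewhere in this thread. The function name and the default gap are made up, and the fake request simply repeats the last real one at a later arrival time.

```python
def with_sentinel(trace_text, gap=1.0):
    """Append a copy of the last request, shifted later by `gap` time units."""
    lines = [line for line in trace_text.splitlines() if line.strip()]
    if not lines:
        return trace_text
    time, dev, blkno, size, flags = lines[-1].split()
    lines.append(f"{float(time) + gap:.6f} {dev} {blkno} {size} {flags}")
    return "\n".join(lines) + "\n"
```

If the simulator drops only the final request, the statistics then cover all original requests. Whether one trailing request is enough should be checked against the "Total Requests handled" line of the output file.
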
  44. Dear Anjo,
    There was one message when I run the hybrid disksim.
    The message is "Stopping simulation because of saturation: simtime 413.282069, totalreqs 11283" .

    It seems related to the "MAX_QUEUE_LENGTH" defined in the disksim_logorg.c .

    Would you have any advice?

    Thank you.

    Peter

  45. Dear Anjo,

    The trace file can be found at the following link.

    https://skydrive.live.com/redir.aspx?cid=90c103e3ba8970d8&resid=90C103E3BA8970D8!152&parid=root

    And after I run it, there is the message "Stopping simulation because of saturation: simtime 413.282069, totalreqs 11283".

    Would you give me some advice?

    Thank you.

    Peter

  46. Hi Peter,

    I have only seen this type of failure when some DiskSim operation blocks one of the modules, e.g. the requests are queued but never actually sent to the next component. By printing which requests actually hit the diskctlr, you could find the last one that executes properly and from there track what happens to the data structures, or where this request blocks others...

    I'm sorry that I can't tell you more; I'm working on a deadline until May. Afterwards I may find some time to have a closer look at the actual trace file.

    Hope it helps...

    Anjo

  47. Hi Anjo,
    Thank you for sharing your work online.
    I wonder whether I could configure a RAID (0 and 5) over multiple instances of your hybrid HDD+SSD, and I am also trying to build a RAID over multiple hybrid SSDs. I tried to write a parameter file, but I have some trouble setting up the 'logorg's. A logorg requires device names, and it seems that I need some 'abstract' device name for one hybrid disk (which is actually two disks). I am a newbie to DiskSim, so what I say may not be understandable. In a word, could you possibly share your ideas on how to configure RAID 5 over your hybrid models?
    Thanks!

  48. Hi Anjo&&Seo,
    So glad to find this topic you are chatting. I also have this puzzle that whether we could configure a ssd raid using ssdsim. If we can, how to do it? I am puzzled:-) Thanks a billion.

  49. Hi,

    I haven't looked at the RAID implementation of disksim. From what you're saying (single name for logorg), I think that it would require a large implementation effort.

    What you could do instead, which might be faster, is to run multiple DiskSim instances using disksim_interface.h and implement a RAID configuration on top of them.

    I think you have to change the interface implementation a little to run multiple DiskSim instances at the same time, but it is possible (I have done it before). The problem is that the variable disksim is defined globally, so you have to set and reset it within the interface. I may upload my additions to the interface, which make multiple DiskSim instances run at the same time, after my deadline in early May.

    Anjo

    Replies
    1. Anjo,
      Thanks for your detailed reply. It gives me a good path to follow. Right now, I just want to use the hybrid disk for my research. I may need RAID over SSDs and RAID over HDDs in the future. If you have any creative ideas on this topic, could you let me know? Thanks again.
      My email is :liangxiongxu@gmail.com

  50. Dear Anjo,

    I have some questions about your hybrid disksim.

    Does the hybrid disksim have some restriction?
    (1) request size(how many blocks) for each request?
    (2) block number for each request can't be bigger than?
    (3) arrival time interval between two request can't be
    smaller than?

    Thank you.

    Peter

    Replies
    1. Hi,

      In general I would say there are no restrictions other than the ones introduced by the SSD module or DiskSim itself.

      To 1) For example, requests to the SSD have to be aligned to 8 and the request size has to be at least 8 blocks (this restriction comes from the SSD module implementation from MSR, which I'm using).
      To 2) The largest block number depends on the size of the SSD, which you can configure in the .parv file. I think block numbers are of type long in DiskSim, so that should be enough to address a lot of space.
      To 3) I have never experimented with very small arrival times and do not know of any restrictions in DiskSim. Maybe you should ask on the DiskSim mailing list; they probably know these things better than I do.

      Anjo

  51. Dear Anjo,

    When I ran a trace file in which all requests are SSD requests, I saw that the average response time is always zero in the IOdriver Statistics section of the output file. Do you have any idea why?

    Thank you.

    Peter

  52. Dear Anjo,
    Thank you for your help before. Right now, I can do most things. There is just one thing I don't understand: why aren't all trace requests processed? For example, we have 10000 requests, but only 7884 were processed.


    IODRIVER STATISTICS
    -------------------

    IOdriver Total Requests handled: 10000

    IOdriver Critical Reads: 4869 0.486900

    IOdriver Critical Writes: 5131 0.513100



    CONTROLLER STATISTICS
    ---------------------

    Controller #0

    Controller #0 cache requests: 10000
    Controller #0 cache read requests: 4869 0.4869
    Controller #0 cache atoms read: 63144 0.4951
    Controller #0 cache read misses: 4869 0.4869 1.0000
    Controller #0 cache read full hits: 0 0.0000 0.0000
    Controller #0 cache fills (read): 4869 0.4869 1.0000
    Controller #0 cache atom fills (read): 63144 0.4951 1.0000
    Controller #0 cache write requests: 5131 0.5131


    Controller #0 devices Number of reads: 4869 0.617580
    Controller #0 devices Number of writes: 3015 0.382420


    DISK STATISTICS
    ---------------

    Disk Total Requests handled: 7884
    Disk Requests per second: 151.415705
    Disk Completely idle time: 0.000000 0.000000
    Disk Response time average: 6.604335
    Disk Response time std.dev.: 3.264964
    Disk Response time maximum: 17.348610


    Here is the synthetic trace generation
    disksim_pf Proc {
    Number of processors = 2,
    Process-Flow Time Scale = 1.000000
    } # end of Proc spec

    disksim_synthio Synthio {
    Number of I/O requests to generate = 10000,
    Maximum time of trace generated = 100000000.000000,
    System call/return with each request = 0,
    Think time from call to request = 0.0,
    Think time from request to return = 0.0,
    Generators = [
    disksim_synthgen { #generator 0
    Storage capacity per device = 286749479,
    devices = [
    disk0
    ],
    Blocking factor = 8,
    Probability of sequential access = 0.5,
    Probability of local access = 0.0,
    Probability of read access = 0.5,
    Probability of time-critical request = 1.000000,
    Probability of time-limited request = 0.0,
    Time-limited think times = [
    normal,
    30.000000,
    100.000000
    ],
    General inter-arrival times = [
    exponential,
    0.0,
    0.0
    ],
    Sequential inter-arrival times = [
    normal,
    0.0,
    0.0
    ],
    Local inter-arrival times = [
    exponential,
    0.0,
    0.0
    ],
    Local distances = [
    normal,
    0.0,
    40000.000000
    ],
    Sizes = [
    exponential,
    0.0,
    8.000000
    ]
    }
    ]
    } # end of Synthio spec

    Replies
    1. Hi,

      from the statistics it looks like the following:
      Controller #0 cache write requests: 5131 0.5131
      Controller #0 devices Number of writes: 3015 0.382420

      Which basically means that of the cache's roughly 5000 write requests, only about 3000 are passed to the device. How can this happen? There are two possible features of the cache:

      Coalescing: Since your workload shows sequentiality (50%), requests can be combined into one when one ends where another starts. This is done because the cache usually buffers writes for a short amount of time and submits them only once the cache is full.

      Write-back cache: Depending on how you configured the write scheme of the cache (a parameter in the .parv file), the cache delays writes. At the end of a trace, not all writes may actually be persistent; some may still be in the cache. This depends on the cache size. A 16MB cache, for example, can store about 4000 requests of size 8. It's more complex than that, but this is just a quick calculation.

      I think that it is a combination of the two points mentioned above. If this does not make sense, feel free to ask more questions. Please include the complete .parv file and a complete listing of the statistics then.

      Thanks,
      Anjo

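
The quick calculation in the reply above can be written out explicitly. This sketch assumes 512-byte blocks and reads "16MB" as 16 MiB; the counts are copied from the statistics quoted in the question.

```python
BLOCK_BYTES = 512                    # assumed block (sector) size
cache_bytes = 16 * 1024 * 1024       # "16MB" cache, taken as 16 MiB
request_blocks = 8                   # request size used in the estimate

request_bytes = request_blocks * BLOCK_BYTES        # 4096 bytes per request
requests_in_cache = cache_bytes // request_bytes    # 4096 requests fit

cache_write_requests = 5131   # "Controller #0 cache write requests"
device_reads = 4869           # "Controller #0 devices Number of reads"
device_writes = 3015          # "Controller #0 devices Number of writes"

writes_not_forwarded = cache_write_requests - device_writes   # 2116
disk_requests = device_reads + device_writes                  # 7884
```

The 2116 writes that never reached the device account exactly for the gap between the 10000 generated requests and the 7884 requests the disk reports. That number is also below the roughly 4096 requests such a cache could hold, so it is at least consistent with the explanation that the missing writes were merged or were still buffered when the simulation ended.
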
  53. Dear Anjo,
    I always follow your answers to other people and really appreciate them. I would like to know why I can only read from, but not write to, the SSD (hybrid SSD-HDD).

    For example, I have 100 requests, half of which are read requests. When I run the hybrid simulator, only the read requests are processed. The write requests seem to be ignored.
    I don't know the reason. If you could give me some hints, that would be very helpful.

    Thank you so much.

    Leon

    Replies
    1. Hi,

      From this distance I have no idea; I haven't worked with the simulator in a while. My suggestion would be to debug one access completely and see where it gets stuck. You may find that a queue is misconfigured or that the cache is not working properly.

      Please let me know if you find something... Sorry that I couldn't be more helpful.

      Thanks,
      Anjo

    2. Dear Anjo,
      Thanks for your quick reply. I really appreciate it.
      I checked my project but couldn't find the error, so I uploaded it to my Dropbox. Could you check it for me? Thanks.
      I wrote down all the problems in "readme at first.txt". The description is too long for a comment, so I put everything into one file as you suggested. The folder HybridDiskSim is the project I am using. My host operating system is Ubuntu 12.04 LTS.
      Link to Dropbox:
      https://www.dropbox.com/sh/nhavksi2jx5zfdg/Y8OFYsO4ux
      Thanks again.

      Leon
      liangxiongxu@gmail.com

    3. Hi,

      To 1) of the readme file:
      The statistics look fine to me. Why the trace stops or why no writes appear, I have no idea...
      To 2) of the readme file:
      I have never tried to generate SSD and HDD requests at the same time. I always used external traces to send accesses.

      General comment:
      A master's thesis on what you want to achieve seems to be available here:
      https://www.google.com/history/url?url=http://scholar.lib.vt.edu/theses/available/etd-12152008-144246/unrestricted/Pavan_Konanki_Fall08.pdf&ei=TkjXT-3pAo61-gai0434DA&sig2=DXlPChcsvC84aNheZplDsA&ct=w

      Also, the current model does not support the write-back SSD cache you're talking about. At least, I have never tried to use the disksim caches on top of the SSD model. Rather than using this hybrid model, I would suggest building a system simulator. Such a simulator runs two different disksim instances at the same time, one SSD and one HDD. You have to write the wrapper around them, which is basically the write-back cache on top of the SSD model. An example of how to program disksim through the system simulator interface is in the src folder of any regular disksim.

      Please excuse that I couldn't be of more help...

    4. Dear Anjo,
      Good morning. Your ideas are very enlightening, and you have helped me a lot. Thanks :-)
      My work is much simpler than the master's thesis. It also does not involve a write-back cache (sorry for being misleading). I just want to use your hybrid simulator, which can run the SSD simulator and the HDD simulator at the same time. That way, I can read/write data from/to the SSD or the HDD and then get the I/O performance statistics of the hybrid simulator from the output file, mainly the average read and write response times.

      I have two small questions. I think you are an expert in DiskSim. Sorry for bothering you so much.
      (1)
      I am glad to hear that the statistics look fine for my first question. Which operating system are you using? I am using Ubuntu 12.04 LTS. Maybe I could try installing another OS.
      Furthermore, did you run my code from the Dropbox folder or your own code? Could you tell me the difference between your code and mine so that I can modify mine accordingly?

      (2) Could you run the following command for me and check the result? The result was weird when I ran it :-) The trace contains five write requests to the SSD, but the output file shows "No ssd requests encountered".

      ./src/disksim example_ssd.parv example.outv ascii example.trace 0

      10.829285 1 6307768 8 0
      27.191135 1 6357144 8 0
      49.792530 1 6357152 8 0
      64.314779 1 33701824 64 0
      85.911729 1 40000000 64 0
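
      For reference, the five fields per line can be read as in the sketch below. It assumes the default DiskSim ASCII trace layout (arrival time in milliseconds, device number, block number, request size in blocks, hex flags with bit 0 set for reads); please check this against the DiskSim manual. The class and function names are made up for this example.

```python
from typing import NamedTuple


class TraceRecord(NamedTuple):
    arrival_ms: float   # arrival time relative to simulation start
    devno: int          # device number the request is sent to
    blkno: int          # first block of the request
    size: int           # request size in blocks
    flags: int          # assumed: bit 0 set means read, clear means write

    @property
    def is_read(self) -> bool:
        return bool(self.flags & 1)


def parse_trace_line(line: str) -> TraceRecord:
    """Split one whitespace-separated trace line into its five fields."""
    t, dev, blk, size, flags = line.split()
    # The flags field is assumed to be hexadecimal.
    return TraceRecord(float(t), int(dev), int(blk), int(size), int(flags, 16))


trace = """\
10.829285 1 6307768 8 0
27.191135 1 6357144 8 0
49.792530 1 6357152 8 0
64.314779 1 33701824 64 0
85.911729 1 40000000 64 0
"""

records = [parse_trace_line(line) for line in trace.splitlines()]
for r in records:
    print(r.devno, "read" if r.is_read else "write", r.blkno, r.size)
```

      Under these assumptions all five lines are writes addressed to device 1. Whether device 1 is the SSD depends on how the devices are numbered in the parameter file.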


      Sorry to bother you so much.
      I really appreciate your help.

    5. Your explanation is very clear and reasonable. Thanks.

  54. Dear Anjo,
    Here is my email address. You can contact me by email or Facebook if you want.

    liangxiongxu@gmail.com

    Thanks

  55. Hi Anjo,
    As you said: "This simulator runs two different disksim instances at the same time, one SSD one HDD. You have to write the wrapper around it, basically the write-back cache on top of the SSD model."
    My goal is to use an SSD and an HDD at the same time, but not as a RAID. I would store write requests on the SSD first and then move the virtual data to the HDD once the SSD is full. For a read request, I would try the SSD first and fall back to the HDD if the data is not on the SSD.

    So your suggestion is to use the system simulator. Are the SSD and the HDD independent in the system simulator or not?
    How about your hybrid simulator? Which of the two is more suitable for my requirements?
    Thanks.

    Replies
    1. Hi,

      The system simulator is only an interface into disksim that lets you write a program around it. It basically changes disksim from being driven by a trace to being driven by a program you write, which sends the requests to disksim.

      I would suggest using two different disksim instances. If you look at how disksim is programmed, you will notice that it uses a global variable named disksim. The system simulator interface exports a local disksim variable, so that you can instantiate multiple disksims.

      If you are able to run multiple disksim instances, it does not make sense to use the hybrid simulator, because the hybrid simulator basically merges both devices into a single instance. Which one you use is up to you...
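
      To make the wrapper idea concrete, here is a rough Python sketch of the routing policy described in the question (write to the SSD first, move the oldest data to the HDD when the SSD is full, read from the SSD if the block is there and from the HDD otherwise). Everything in it is hypothetical: TieredStore, ssd_submit and hdd_submit are made-up names, and the two callbacks only stand in for code that would forward a request to its own disksim instance. It tracks data by starting block only and ignores the SSD read that a real migration would need.

```python
from collections import OrderedDict


class TieredStore:
    """Hypothetical wrapper that routes requests to an SSD or an HDD backend."""

    def __init__(self, ssd_capacity_blocks, ssd_submit, hdd_submit):
        self.capacity = ssd_capacity_blocks
        self.ssd_submit = ssd_submit   # stand-in for the SSD simulator instance
        self.hdd_submit = hdd_submit   # stand-in for the HDD simulator instance
        self.on_ssd = OrderedDict()    # start block -> size, oldest first
        self.used = 0

    def write(self, now, blkno, size):
        # An overwrite replaces the old copy on the SSD.
        if blkno in self.on_ssd:
            self.used -= self.on_ssd.pop(blkno)
        # Make room by moving the oldest data to the HDD.
        while self.on_ssd and self.used + size > self.capacity:
            old_blk, old_size = self.on_ssd.popitem(last=False)
            self.used -= old_size
            self.hdd_submit(now, old_blk, old_size, is_read=False)
        self.on_ssd[blkno] = size
        self.used += size
        self.ssd_submit(now, blkno, size, is_read=False)

    def read(self, now, blkno, size):
        # Serve from the SSD when the block is there, otherwise from the HDD.
        target = self.ssd_submit if blkno in self.on_ssd else self.hdd_submit
        target(now, blkno, size, is_read=True)


# Usage with logging stubs instead of real simulator instances.
log = []
store = TieredStore(
    16,
    lambda t, b, s, is_read: log.append(("ssd", b, is_read)),
    lambda t, b, s, is_read: log.append(("hdd", b, is_read)),
)
store.write(0.0, 100, 8)
store.write(1.0, 200, 8)
store.write(2.0, 300, 8)   # SSD is full, so block 100 moves to the HDD
store.read(3.0, 100, 8)    # no longer on the SSD, goes to the HDD
store.read(4.0, 300, 8)    # still on the SSD
print(log)
```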

      Anjo

  56. Hello Anjo!

    I am having a problem running the trace file Financial1.spc from http://traces.cs.umass.edu/index.php/Storage/Storage

    Can you, or anyone here, walk me through it? I can run synthetic workloads, but I can't seem to run an external trace!
    Thank you!
