Let us slow down

Wednesday, May 16, 2012

How to debug debugger?

Yes, the title says it all.
If you have ever written your own debugger extension to make debugging your program or kernel module easier, you have probably wondered how to debug the extension itself, since it runs inside the debugger.

The answer is pretty simple: a couple of WinDbg commands do the job.
First, there is the traditional '.dbgdbg' command. It spawns a CDB session and creates a remote session, so you can either debug in the CDB session directly or start WinDbg and connect to the remote session.
The CDB session is not very friendly, though, so I often start a new WinDbg session to debug my own debugger extension. This works but requires a couple of extra steps.

Then today I learned a new command: '!debugme -lw'. It simply creates a WinDbg session attached to your own debugger, which cuts out the extra CDB session and the remote connection. Very simple and nicely done!

So next time you need to debug your own debugger extension, try '!debugme -lw'.

In this post I assumed that you know how to develop a WinDbg debugger extension. Please refer to Advanced Windows Debugging for more information on that topic. The book has a nice example that you can probably adapt, replacing some of the code, to create your own debugger extension.

Wednesday, April 18, 2012

Short introduction to Virtual Machine Extensions

Things have changed greatly in the virtualization area. A few years ago, we had to emulate hardware either by patching binaries at runtime or by modifying the guest OS. Not every instruction ran natively, and the goal was to run as many instructions as possible natively on the actual processor without VMM intervention.

Nowadays, there is a better way to do virtualization: hardware support! Hardware vendors have added many new CPU features so that we no longer have to do all the work in software. At the heart of this technology are VMX on Intel and SVM on AMD. Here is a high-level overview of these technologies. The basic idea is that from time to time the VMM needs a way to gain control so that it can feed different information to the guest.
First, the system software has to tell the processor which guest instructions it wants to gain control on. We do this by configuring the virtual-machine control structure (VMCS).

Once this is set up and the guest OS executes one of the instructions we registered, the processor transitions from VMX non-root operation (the guest) to VMX root operation (the VMM), so that the VMM can give processed information back to the guest. This transition is called a VM exit.

Following is a diagram from Intel Manual with regard to the interactions between VMM and VM guests.

There are some other ways to enter the VMM, but that is essentially how hardware-assisted virtualization works at its core. Here I have skipped how the system software discovers VMX support in the processor and how it enables the feature. For more information, please refer to the Intel/AMD documentation. Hopefully, I will have more to say about those in the next posts.

Thursday, February 23, 2012

Address Translation Walk-through (Virtual to Physical)

Just recently I had a chance to reread part of Windows Internals and chat with a co-worker about memory management. He showed me how to walk the page tables to translate a virtual address to a physical address, so let me share that experience here.

For this exercise, we will use the virtual address of nt!NtCreateFile as an example. We will convert that address to a physical address and compare what is there.
In this example we use 32-bit Windows in PAE mode. There are several ways to check whether Windows is running a PAE kernel; one easy way is to look at where the page directory is mapped. If the PDEs are mapped at 0xC0600000, it is a PAE kernel. Please refer to the Free System Page Table Entries blog post for more info on it.

PAE mode uses the following three levels of page tables, so we need to keep this in mind as we translate.

Page Directory Pointer Table
Page Directory
Page Table

In addition, in PAE mode each entry is 8 bytes in size, so we need to multiply each index value by eight. In non-PAE 32-bit mode there would be only two levels and each entry would be 4 bytes long. With PAE mode on, we can address more physical memory because each PDE and PTE is 64 bits wide. Internally the system then represents page frame numbers (PFNs) with 24 bits, and together with the 12-bit page offset it can support up to 2^(24+12) bytes, that is, 64 GB.
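The numbers above can be checked with a few lines of arithmetic. This is only a sketch in Python: the constant names are made up for illustration, and the 2/9/9/12 bit split is the one used later in this walk-through.

```python
# Sizes for 32-bit PAE paging as described in this post.
ENTRY_SIZE = 8            # bytes per PDPT/PDE/PTE entry in PAE mode
PAGE_SIZE = 4096          # 12-bit page offset
PDPT_ENTRIES = 4          # 2 index bits
ENTRIES_PER_TABLE = 512   # 9 index bits for page directory and page table
PFN_BITS = 24             # PFN width quoted in the text
PAGE_OFFSET_BITS = 12

# A page directory or page table fills exactly one 4 KB page.
table_bytes = ENTRIES_PER_TABLE * ENTRY_SIZE

# The three levels together still cover only the 32-bit virtual space.
virtual_span = PDPT_ENTRIES * ENTRIES_PER_TABLE * ENTRIES_PER_TABLE * PAGE_SIZE

# The wider entries are what extend the physical side.
max_physical = 1 << (PFN_BITS + PAGE_OFFSET_BITS)

print(table_bytes, "bytes per table")
print(virtual_span // 2**30, "GiB of virtual address space")
print(max_physical // 2**30, "GiB of physical address space")
```

So PAE does not enlarge the virtual address space of a process; it only lets those 4 GB of virtual addresses map onto a larger pool of physical pages.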

0. Locate virtual address of NtCreateFile
kd> x nt!ntcreatefile

8286f2a2 nt!NtCreateFile = no type information

kd> !pte nt!ntcreatefile

VA 8286f2a2
PDE at C06020A0 PTE at C0414378
contains 00000000001D0063 contains 000000000286F121
pfn 1d0 ---DA--KWEV pfn 286f -G--A--KREV

1. Find the current process
kd> !process -1 0
PROCESS 85b9fbb0 SessionId: 0 Cid: 03d0 Peb: 7ffd5000 ParentCid: 0238
DirBase: 7ef5b1a0 ObjectTable: 8a585968 HandleCount: 594.
Image: svchost.exe

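2. Check the cr3 value for the location of the PDPT. Normally this should match the DirBase from step 1; in this capture the two values differ (7ef5b1a0 vs. 7ef5b080), and the walk below uses the cr3 value.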
kd> r cr3
cr3=7ef5b080

3. Get binary representation of NtCreateFile virtual address
kd> .formats 0x8286f2a2
Evaluate expression:
Hex: 8286f2a2
Decimal: -2105085278
Octal: 20241571242
Binary: 10000010 10000110 11110010 10100010
.. skip ...

Let us split the binary value as follows so that we can use the pieces as indexes into the tables.
10 (index into PDPT) 000010100 (index into page directory) 001101111 (index into page table) 001010100010 (offset)
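The same split can be done with shifts and masks instead of by hand. Below is a minimal Python sketch; the function name split_pae_va is hypothetical and only serves this illustration.

```python
def split_pae_va(va):
    """Split a 32-bit virtual address into its PAE fields (2/9/9/12 bits)."""
    pdpt_index = (va >> 30) & 0x3     # top 2 bits
    pd_index = (va >> 21) & 0x1FF     # next 9 bits
    pt_index = (va >> 12) & 0x1FF     # next 9 bits
    offset = va & 0xFFF               # low 12 bits
    return pdpt_index, pd_index, pt_index, offset


# Virtual address of nt!NtCreateFile from step 0.
pdpt_index, pd_index, pt_index, offset = split_pae_va(0x8286F2A2)

print(format(pdpt_index, "02b"), format(pd_index, "09b"),
      format(pt_index, "09b"), format(offset, "012b"))
```

The four printed fields should match the hand-split binary groups above.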

4. Get to the PDPT entry. The 0y prefix marks a binary number and l1 means only one entry (that is the letter 'l', not the pipe '|')
kd> dd /p @cr3+0y10*8 l1
7ef5b090 1ad8b801

5. Check what's in the entry. Of the 32 bits shown, only the upper 20 bits are the page frame number; the low 12 bits are flags
kd> .formats 1ad8b801
Evaluate expression:
Hex: 1ad8b801
Decimal: 450410497
Octal: 03266134001
Binary: 00011010 11011000 10111000 00000001
... skip ...

6. Get the page frame number of the page directory (not strictly necessary, but this is one way to convert the binary value to hex)
kd> ? 0y00011010110110001011
Evaluate expression: 109963 = 0001ad8b

7. Get to PDE for this particular address
kd> dd /p 0x0001ad8b * 0x1000 + 0y10100 * 8 l1
1ad8b0a0 001d0063

8. Get to PTE
kd> dd /p 0x1d0 * 0x1000 + 0y1101111 * 8 l1
001d0378 0286f121

9. Finally, get the actual physical address. This time we add the offset as is, without multiplying by 8
kd> up 0x286f000+0y1010100010
0286f2a2 8bff mov edi,edi
0286f2a4 55 push ebp
0286f2a5 8bec mov ebp,esp
0286f2a7 51 push ecx
0286f2a8 33c0 xor eax,eax
0286f2aa 50 push eax
0286f2ab 6a20 push 20h
0286f2ad 50 push eax

10. Let us compare with the disassembly at the virtual address

kd> uf nt!ntcreatefile
nt!NtCreateFile:
8286f2a2 8bff mov edi,edi
8286f2a4 55 push ebp
8286f2a5 8bec mov ebp,esp
8286f2a7 51 push ecx
8286f2a8 33c0 xor eax,eax
8286f2aa 50 push eax
8286f2ab 6a20 push 20h
8286f2ad 50 push eax
8286f2ae 50 push eax

Both show the same content, so we translated the address correctly. Cool!
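The whole walk can also be replayed offline. The following Python sketch simulates physical memory with a dictionary holding the three entries read in steps 4, 7 and 8. It rests on several assumptions: the high 32 bits of each 8-byte entry are taken to be zero (dd showed only the low dword), the frame mask is a simplification, large pages are ignored, and the names PHYS_MEM, read_entry and translate are made up for this example.

```python
# Entries copied from the dd /p output above (low 32 bits only; the high
# 32 bits of each 8-byte entry are assumed to be zero in this sketch).
PHYS_MEM = {
    0x7EF5B090: 0x1AD8B801,  # PDPT entry 2
    0x1AD8B0A0: 0x001D0063,  # page directory entry 0x14
    0x001D0378: 0x0286F121,  # page table entry 0x6f
}

ENTRY_SIZE = 8                    # PAE entries are 8 bytes
FRAME_MASK = 0x000FFFFFFFFFF000   # assumed mask: drop the low 12 flag bits
PRESENT = 0x1                     # bit 0 is the valid/present flag


def read_entry(table_base, index):
    """Read one table entry from the simulated physical memory."""
    entry = PHYS_MEM[table_base + index * ENTRY_SIZE]
    if not entry & PRESENT:
        raise LookupError("entry is not present")
    return entry


def translate(cr3, va):
    """Walk PDPT -> page directory -> page table (4 KB pages only)."""
    pdpt_index = (va >> 30) & 0x3
    pd_index = (va >> 21) & 0x1FF
    pt_index = (va >> 12) & 0x1FF
    offset = va & 0xFFF

    pdpt_base = cr3 & ~0x1F       # the PDPT is 32-byte aligned
    pdpte = read_entry(pdpt_base, pdpt_index)
    pde = read_entry(pdpte & FRAME_MASK, pd_index)
    pte = read_entry(pde & FRAME_MASK, pt_index)
    return (pte & FRAME_MASK) | offset


physical = translate(0x7EF5B080, 0x8286F2A2)
print(hex(physical))
```

With the values from this post, the result should be the same physical address that was disassembled in step 9.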

Monday, January 2, 2012

Book - Inside the C++ Object Model

Over the Christmas holiday break, I read 'Inside the C++ Object Model', so here I would like to briefly comment on the book and summarize it.

This book is highly recommended by many people; I found out about it from an old article by Don Box while I was reading 'Essential COM'. The book contains unique material that describes the internals of C++. For instance, it covers in detail what exactly occurs when we use a copy constructor and when and how we should use it. The most important lesson I learned from this book is not so much the technical details as the attitude of understanding the internals of C++ in order to write better programs. The book describes how things are actually built and run when we write our C++ code. If we are not careful, it is very easy to create a program that does not perform well.

Of course, this book is pretty old, so some of it might not apply any more, but it still gives very good insight into C++. One more thing to mention if you want to read this book: it has a number of typos and wrong diagrams, so be prepared for that. I once spent hours trying to figure out a piece of code, and it turned out that there was a typo. I tried to find an errata site for this book but could not find one on the web, so please take note of that.

With that, I would like to illustrate a couple of things from chapter 2 of the book.
First, constructors. The author mentions that programmers new to C++ often have two common misunderstandings:

That a default constructor is synthesized for every class that does not define one.
That the compiler-synthesized default constructor provides explicit default initializers for each data member declared within the class.

The compiler only generates a default constructor under certain circumstances, and it is the programmer's job to explicitly initialize data members if needed. For instance, the compiler generates a default constructor if the class contains data members that have a default constructor, so that it can call the default constructor of each such member. The compiler also generates a default constructor if the class declares (or inherits) a virtual function, so that it can take care of the virtual function table and the pointers held in the table. The lesson we can learn from this is that we need to take care of initialization ourselves where necessary and not rely on the default constructor. In addition, a great deal happens when we define 'T object;', so keep that in mind.

Second, temporary objects. Understand that temporary objects are created at times. For example, look at the following code.

X bar()
{
    X xx;
    // Process xx...
    return xx;
}

Here, the return value is copy constructed from the local object xx, so the compiler would transform the code roughly as follows (this is the book's pseudo-code, not compilable C++).

void
bar (X& __result )
{
    X xx;

    // compiler generated invocation
    // of default constructor
    xx.X::X();

    // Process xx...

    // compiler generated invocation
    // of copy constructor
    __result.X::X( xx );

    return;
}

Then the author shows two ways to optimize the code above. The following is what the compiler can do on its own (the book calls this the named return value, or NRV, optimization).

void
bar (X& __result )
{
    // compiler generated invocation
    // of default constructor
    __result.X::X();

    // Process in __result directly

    return;
}

This looks great, but there can be issues with the code above. For instance, if we had instrumented the destructor of the local object to do something, that code no longer runs, because the local object and its destructor call have been removed. So we need to be mindful of our code and of what is actually taking place.

The book has plenty of points and illustrations about the internals of C++, so maybe I will summarize the important ones on my blog again to remind myself in the future. As you can see, if you are interested in this kind of insight, give it a try. After all, the book is not so thick. ;-)

Thursday, December 8, 2011

Powershell goodies

Last time I posted about my attempt to learn PowerShell again, and I have found it quite interesting to learn this new scripting language on Windows. Yes, there are many scripting languages available on other platforms, but in my opinion none of them works really well on Windows, not because they are bad but because they were not really meant for Windows. I guess we can argue about that all day, but in any case I would like to show another way to increase your PowerShell knowledge: the PowerShell Code Repository! This web site has many interesting scripts that can be used right away.

For instance, for years, every time I wanted to list only the directories in my current location, I had to search online to figure out how to do it, or run 'dir | more' and look for myself. Then, as I began to learn PowerShell, I found that I can do this with 'dir | Where-Object { $_.PSIsContainer }' (the property is already a boolean, so no comparison against $true is needed), but that is too long to type.
Then I found a script that reduces the typing: the 'Compare-Property' function from the PowerShell Code Repository. With this function, I can do the same task in the following manner.

PS >Set-Alias ?? Compare-Property
PS >dir | ?? PsIsContainer

Pretty cool. So all you have to do is go to the site and see whether it has any scripts that you want to use. Then download them and keep them in one specific folder. After that, add that folder to your path in your $profile as follows.

$env:path = $env:path + ";c:\scripts"

Once you do that, you can use a script name such as 'Compare-Property' in the example above as if it were a regular PowerShell cmdlet. That's all for now. Happy PowerShell!

Sunday, November 20, 2011

Powershell works differently

For years I tried to pick up PowerShell, but every time I tried, I failed miserably to feel comfortable with it. Still, I felt strongly that I needed to learn some scripting language to automate work on Windows, so I decided to give it one more try.

This post is by no means a claim that I have mastered it, but I think I finally understand why I could not learn PowerShell: I approached it from my bash scripting experience. What do I mean by that? When we use the bash shell, we work with strings, but that does not work well in PowerShell. PowerShell commands return objects, not strings, so when we want to process the output of a command, we need to approach it from this perspective. Let me illustrate this.

Often we want to list something and then 'grep' certain words from the output. For instance, let us try the 'alias' command in PowerShell.

The alias command produces more than a page of output. So I would like an easy way to see whether the alias output has a line that contains the word 'content'. I readily think of the 'grep' command, and I remember that Select-String works somewhat similarly. So let us try Select-String.

Hmm. Nothing... But I know that there is a line that has the word 'content'. What's going on here? Apparently, Select-String is only matching against the Name column. So let us now try 'asnp' to verify it.

OK, that is better, but I want to get the entire line. What should I do?
This is where I realized that PowerShell works on objects, not strings. When PowerShell returns the output of the alias command, it actually returns a list of objects. So if we want to get the whole line, which in this case is an object, we need to match on a property of the object and get the entire object back. Here is the trick.

Yes, we got what we wanted. Again, this is because we asked for the objects whose definition contains 'content', and PowerShell returned the list of those objects.

In summary, PowerShell operates on objects, so keep that in mind whenever you use it!

Thursday, October 20, 2011

Increase the productivity with raid0

Yesterday I was busy setting up my machine at work, and one of my co-workers shared a tip for increasing the machine's throughput, so let me share it here with all of you.
Every time we build projects or do other IO-intensive work, things get slower because the disks become the bottleneck. Hence, if we can avoid writing to and reading from the disk, productivity increases. There are numerous examples of this idea out there, and probably one of the best known is memcached.

So what can we do about disk IO? If we have to write to the disk anyway, there is a way to improve the situation with RAID 0. Normally, when we think about RAID, we tend to think about redundancy, but RAID 0 is not about that. What RAID 0 does is striping. According to the Wikipedia page, it is defined as follows.
A RAID 0 (also known as a stripe set or striped volume) splits data evenly across two or more disks (striped) with no parity information for redundancy.
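To make the striping idea concrete, here is a minimal Python sketch. It assumes plain round-robin placement of fixed-size stripe units; the function name stripe_location is made up for illustration, and a real volume manager is of course more involved.

```python
from typing import Tuple


def stripe_location(logical_unit: int, num_disks: int) -> Tuple[int, int]:
    """Map a logical stripe unit to (disk index, stripe unit on that disk)."""
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least two disks")
    if logical_unit < 0:
        raise ValueError("logical_unit must be non-negative")
    # Consecutive units alternate between the disks, so a large sequential
    # read or write keeps all disks busy at the same time.
    return logical_unit % num_disks, logical_unit // num_disks


# Where the first six stripe units land on a two-disk striped volume.
layout = [stripe_location(unit, 2) for unit in range(6)]
print(layout)
```

Because neighbouring units sit on different disks, both disks can work in parallel, which is where the throughput gain comes from. The flip side is that losing either disk loses the whole volume, since there is no parity.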

My machine happens to have two additional disks, so my co-worker suggested that I use RAID 0 to boost the IO, and that is what I did. So how did I do it? It turned out to be fairly easy on Windows. Here are the steps.
First, fire up 'Computer Management' by going to 'Control Panel', selecting 'Administrative Tools' and then choosing the 'Computer Management' icon. Once it is running, click 'Disk Management' under 'Storage'. You will see all your disks and their partitions. Choose the two disks that you want to use for RAID 0 and right-click, and you will see the menu as follows.

Choose 'New Striped Volume...' and you will be presented with a couple of windows to configure the RAID 0 volume, such as how much of each disk to allocate to it.
Once you do that, you are done. Yes, I know it is too simple, but that is all it takes. Now when you look at the 'Disk Management' console, you will see something like the following.

I created the RAID 0 volume and assigned it the drive letter E:, and I put my source code there so that all the build IO happens there, isolated from other OS and application activity. So if you are a Windows user and have some extra disks in your machine, give it a try!
