Saturday, November 10, 2012

Toaster device - installing wdm driver

I have been trying to understand how the toaster WDM sample device works, and I spent a lot of time just getting its drivers installed on my VM. The readme file is helpful, but I found some of its explanations lacking. At first I used devcon.exe to install the bus driver, but I ran into a couple of issues and could not install it properly.
I searched online and found this MSDN page, which explains at the bottom how to install the toaster bus driver.

So that was helpful, and on to the next issue: the function driver. I could start up a toaster device by using enum.exe, but the function driver was not installed because some files could not be found. The message was not really helpful: it did not say which files were missing, and part of the problem is that I do not fully understand what the install requires. I could have spent time learning the driver package requirements, but my goal was to understand power management using this toaster sample before anything else.
Then, again online, I learned that I can look at setupapi.dev.log for more clues. By the way, this file is located in the C:\Windows\inf directory.
Here is the error message I found in the log file:
!!!  flq:                               Error installing file (0x00000002)
!!!  flq:                               Error 2: The system cannot find the file specified.
!    flq:                                    SourceFile   - 'c:\work\toaster\device\amd64\tostrco2.dll'
!    flq:                                    TargetFile   - 'C:\Users\ILHOYE~1.RED\AppData\Local\Temp\{6d5b7b46-959d-0823-c45e-094dc5a9816c}\amd64\tostrco2.dll'


By now, it is clear that I am missing tostrco2.dll. I don't know what the file is for, but I know that I need it. So I grabbed it from the WDK, and after that I could install the toaster function driver. Now I can happily debug toaster to understand the code flow.

How about traces? Are there any traces available with the toaster driver? Yes. Many kernel drivers use either ETW or WPP tracing, so by providing the appropriate information you can turn debug messages on and off. These messages can be captured and saved to a file, or you can watch them live if you have a debugger attached to your target machine/VM. The toaster driver readme file describes the steps, but let me repeat them here. First of all, start the trace session by executing the following command, where toaster.ctl contains "C56386BD-7C67-4264-B8D9-C4A53B93CBEB toaster".

c:\temp>tracelog -start toaster -rt -kd -ft 1 -guid toaster.ctl -flags 0xff

After that, in the kernel debugger you need to set the wmitrace search path to the location of the TMF file. What is a TMF file? It is the file that contains the information needed to translate the trace messages into human-readable strings. You can generate the TMF file from the PDB file by running 'tracepdb -f toaster.pdb'. Here is how to set the path and enable the debugging messages.

kd>!wmitrace.searchpath + path_of_TMF_files
kd> !wmitrace.strdump
(WmiTracing)StrDump Generic
  LoggerContext Array @ 0x80BF1760 [64 Elements]
    Logger Id  2 @ 0x820C5000 Named 'MSDTC_TRACE_SESSION'
    Logger Id  3 @ 0x81AAF000 Named 'toaster'

kd> !wmitrace.enable 3
With that, you should be able to see the trace messages. Of course, you can always set a breakpoint wherever you want to look into more detail, but knowing how to leverage existing traces should be helpful.

For the toaster bus driver, if you want to see the debug messages, you will need to use the chk build and use dbgview with kernel verbose debugging enabled. However, once you turn on kernel verbose debugging, it will generate all sorts of debugging messages that you may not care about. The toaster bus driver uses DbgPrint for its debug messages, and that is essentially the same as the following.

DbgPrintEx ( DPFLTR_DEFAULT_ID, DPFLTR_INFO_LEVEL, Format, arguments )

Therefore, we need to enable the filter mask for the component and level that DbgPrint uses (DPFLTR_DEFAULT_ID at DPFLTR_INFO_LEVEL). We can do this either by updating the registry or by updating the values via the kernel debugger. For more information, please refer to the MSDN page 'Reading and Filtering Debugging Messages', which describes how to enable messages for a certain component.
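
To make the mask-and-level matching concrete, here is a minimal C++ model of the filter test as I understand it from the 'Reading and Filtering Debugging Messages' page. It is a sketch, not kernel code: the function names levelToBits and wouldPrint are made up for illustration, and the value 3 for DPFLTR_INFO_LEVEL is an assumption taken from the WDK headers.

```cpp
#include <cassert>
#include <cstdint>

// Value of DPFLTR_INFO_LEVEL in the WDK headers (assumed here; check dpfilter.h).
constexpr std::uint32_t kInfoLevel = 3;

// Hypothetical helper: a level in the range 0..31 is treated as a bit position,
// anything larger is treated as a ready-made bit mask.
std::uint32_t levelToBits(std::uint32_t level) {
    return level < 32 ? (1u << level) : level;
}

// Hypothetical helper: a message passes when its level bits overlap the
// component's filter mask or the system-wide mask.
bool wouldPrint(std::uint32_t componentMask, std::uint32_t systemWideMask,
                std::uint32_t level) {
    return (levelToBits(level) & (componentMask | systemWideMask)) != 0;
}
```

Under this model wouldPrint(0x8, 0, kInfoLevel) is true, so a mask of 0x8 for the DEFAULT component should be enough to let the DbgPrint output through without turning on everything else.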


Sunday, October 14, 2012

Knowing what to practice

This week I read a book called "Talent Is Overrated". In the book the author talks about how important it is to be deliberate in our practice in order to improve our skills, but what is even more important is to KNOW what to practice. The author says that many people do not have a clear picture of what to practice, while those who accomplished great things knew exactly what they needed to practice because they knew where they were lacking. I remember my high school teacher saying that the more you study, the more things you see that you need to study, but if you do not study, you do not even know what to study. I think it is a nice book to read, and it challenges readers to be aware of where they stand in their own fields.


Looking at where I am right now, I feel that I do not really have a clear goal and I do not even know what to practice. Of course, there are many things out there I can or should study, but my problem is that I do not see the need, so even when I start out, I lose steam and lose the passion to keep practicing. I have several books in front of me: C#, C++, JavaScript and some OS/kernel books. I have looked at them to a degree, but I have never mastered any of them. When I look at one subject, I feel like I should try another subject. That is pretty bad.

I hope and pray that I will be more consistent in my practice and study, and be deliberate as the book suggests. I will keep posting my progress on this blog so that I can keep track of my status. For now, let me pick one language plus OS/kernel topics for the next month. Perhaps I can write some simple apps such as command line tools, a weather app, or a stopwatch app. Hopefully I will have a positive result to report next month.


Friday, October 12, 2012

String permutation with backtracking

This is probably one of the classic interview questions. There are many solutions to this problem out there, but today a friend of mine mentioned it, and although I know I solved it a long time ago, my memory was fading, so I decided to try it again.

As I thought about the problem, I decided to tackle it with backtracking. The basic idea is that I take one character out of the initial word and mark that position so that it is not picked again further down the recursion. All along I pass a single output character array, and each time I reach the end I simply print the resulting string.

So let me show my code.
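#include <cstdlib>
#include <cstring>
#include <iostream>
#include <string>

using namespace std;

// Fills output[k..n-1] with every ordering of the characters still left in word.
// A character that has been taken is marked by overwriting it with 0.
void perm_internal(string word, char *output, int n, int k)
{
    if (n == k) {
        cout << "[" << output << "]" << endl;
        return;
    }

    for (int i = 0; i < n; i++) {
        if (word[i] == 0) {
            continue;   // already used higher up in the recursion
        }
        char tmp = word[i];
        output[k] = tmp;
        word[i] = 0;    // take the character out
        perm_internal(word, output, n, k+1);
        word[i] = tmp;  // backtrack: put it back
    }
}

void perm(string word)
{
    int n = word.length();
    char *output;

    if (n == 0)
        return;
    output = (char *)malloc(n+1);
    if (output == NULL)
        return;
    memset(output, 0x0, n+1);
    cout << "input word: " << word << "(" << n << ")" << endl;
    perm_internal(word, output, n, 0);
    free(output);       // release the scratch buffer
}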


I want to try out different approaches to solve this problem but it's getting late so I will do that next time.
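
One different approach could lean on the standard library instead of hand-written backtracking. The sketch below is not the code from this post: it sorts the characters and then steps through the orderings with std::next_permutation. The function name permutations is just a name picked for this example. Unlike the backtracking version, it returns the results instead of printing them, and it yields each distinct ordering only once when the word has repeated letters.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns all distinct permutations of word in sorted order.
std::vector<std::string> permutations(std::string word) {
    std::vector<std::string> result;
    if (word.empty()) {
        return result;
    }
    // next_permutation visits orderings in lexicographic order, so start from
    // the smallest one.
    std::sort(word.begin(), word.end());
    do {
        result.push_back(word);
    } while (std::next_permutation(word.begin(), word.end()));
    return result;
}
```

Calling permutations("abc") gives six strings, starting with "abc" and ending with "cba".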

Wednesday, October 10, 2012

windbg init script

There are a couple of commands that I run every time I start up windbg, and it just dawned on me that perhaps it is time to put these commands into a script and have windbg execute it automatically.

So in this post, let me show you how to do that.
First, create a file that contains all the commands that you would like to run.
For this example, let me create a file called 'dbg-prep.cmd'.

C:\Users\ilhoye\Desktop\WinDbg> type dbg-prep.cmd
.symfix
.reload
.load mex
.load kdexts
aS !pr !process

Once we have this, we can launch windbg with the '-c' option. '-c' specifies a command to execute when windbg starts up, but in our case we want to execute several commands, which is why I created a script in the first place.
To do that, we still use the '-c' option, but this time we give it the '$$><' command followed by the path of the script file, as follows.

windbg.exe -c "$$>< C:\Users\ilhoye\Desktop\WinDbg\dbg-prep.cmd"

Please note that the argument to the '-c' option needs to be quoted as shown above.
Of course, it is cumbersome to type all of this, so it is best to create a shortcut for it. Here is my shortcut command, which also specifies the connection for kernel debugging.

"C:\Program Files\Debugging Tools for Windows (x64)\windbg.exe" -k 1394:channel=2 -c "$$>< C:\Users\ilhoye\Desktop\WinDbg\dbg-prep.cmd"

You can also add arguments to the script and for more information, please refer to msdn.

Monday, October 8, 2012

WINDBG: Setting breakpoints for user-mode process from kernel mode debugger

When working with the kernel debugger, sometimes we may want to set a breakpoint in user mode. Can we do it? Yes, we can. :)

So in this post let me show you how to do that, using notepad as an example.
First, connect the kernel debugger to the target; in my case I use a 1394 debugger connection. Once connected, look for the process that we want to set a breakpoint in.

0: kd> !process 0 0 notepad.exe
PROCESS fffffa8012256980
    SessionId: 1  Cid: 0990    Peb: 7f72f0bf000  ParentCid: 07ac
    DirBase: 1babb8000  ObjectTable: fffff8a007c7a040  HandleCount:  68.
    Image: notepad.exe


Once we have located the process that we are interested in, follow these steps.

0: kd> .process /i fffffa8012256980
You need to continue execution (press 'g' <enter>) for the context
to be switched. When the debugger breaks in again, you will be in
the new process context.
0: kd> g
Break instruction exception - code 80000003 (first chance)
nt!DbgBreakPointWithStatus:
fffff802`46e8f930 cc

The .process command sets the process context to notepad, and the '/i' option means that the target process is to be debugged invasively. In other words, once we execute this command, it prompts us to type 'g'. When we do, the target resumes briefly and then breaks back in with notepad as the active process, and in this context we can set a user-mode breakpoint.

For instance, let us set a breakpoint at NtCreateFile in ntdll. Before we do that, we need to reload the symbols. This reloads not only the kernel symbols but also the user-mode symbols, which we need in order to set the breakpoint.

3: kd> .reload
Connected to Windows 8 9200 x64 target at (Mon Oct  8 18:10:21.107 2012 (UTC - 7:00)), ptr64 TRUE
Loading Kernel Symbols
...............................................................
................................................................
...................
Loading User Symbols
.........................
Loading unloaded module list
......
3: kd> bp /p fffffa8012256980 ntdll!ntcreatefile


Now, let us resume and this time the debugger should be able to break into the user-mode process.

3: kd> g
Breakpoint 3 hit
ntdll!ZwCreateFile:
0033:000007fb`891a30f0 4c8bd1          mov     r10,rcx
1: kd> kcn
 # Call Site
00 ntdll!ZwCreateFile
01 ntdll!LdrpNtCreateFileUnredirected
02 ntdll!LdrpMapResourceFile
03 ntdll!LdrLoadAlternateResourceModuleEx
04 ntdll!LdrpLoadResourceFromAlternativeModule
05 ntdll!LdrpSearchResourceSection_U
06 ntdll!LdrFindResource_U
07 KERNELBASE!FindResourceExW
08 COMDLG32!FindResourceExFallback
09 COMDLG32!FindResourceExMirrorFallback
0a COMDLG32!CFileOpenSave::_GetDialogTemplate
0b COMDLG32!CFileOpenSave::Show
0c notepad!ShowOpenSaveDialog


You can see from the above that the breakpoint was hit and that the call stack really is from user mode.
This technique can be used in many different places. One of them is when we want to break into a certain function of a service process while it is being loaded. There are probably other uses too, but this will definitely save you the work of coordinating two different debuggers.

Thursday, October 4, 2012

Second Level Address Translation - EPT/NPT

This post is more or less a summary of what I have found out so that I can refer to it once I forget. (I know that I will forget this.)

A few posts earlier I described how a page table walk occurs, and here let me briefly describe how it occurs under virtualization software such as Hyper-V. In this post I will just describe the overall process without any debugger examples.

First of all, the overall page walk with virtualization is similar to a regular page table walk. Let me put up the regular page table walk diagram from the wiki page.



However, with virtualization things are a bit different. Keep in mind that whatever the guest OS thinks is a physical address (a guest physical address) cannot be a real physical address, because the hypervisor is the one that manages the real hardware. So what has to happen is another set of translations from guest physical addresses to system physical addresses. Both Intel and AMD provide a solution for this address translation, and they call it by two different names, EPT and NPT, but they are essentially the same thing.

So with guest physical addresses in hand, we can traverse a similar set of data structures to obtain the system physical addresses. On Intel, these data structures are traversed in the order PML4 table - Page Directory Pointer Table (PDPT) - Page Directory (PD) - Page Table (PT).

There are a couple of twists to watch out for here, though.

  • If bit 7 of an EPT PDPT entry is '1', the entry maps a 1-Gbyte page. Otherwise, it points to an EPT page directory, where bit 7 of the EPT PD entry in turn selects between a 2-Mbyte page and a pointer to an EPT page table.
  • For each entry of a table, we need to know the processor's physical-address width to obtain the physical address of the next table.

We can get the processor's physical-address width by executing the CPUID instruction with 0x80000008 in EAX; the physical-address width is returned in bits 7:0 of EAX. Well, that does not sound easy, so here is what I did: go to the MSDN __cpuid page, copy the code into a C++ source file, and use that to obtain the value. On my machine I got 36, so I know that my machine supports physical addresses up to 36 bits wide.
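
To make the two twists concrete, here is a small C++ sketch of the bit manipulation involved. It is only an illustration under my reading of the Intel manual: the helper names (queryCpuidEax, physicalAddressWidth, isLargePage, nextTableAddress) are invented for this post, and the optional queryCpuidEax helper is only compiled on x86 with MSVC, GCC or Clang.

```cpp
#include <cassert>
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
// Returns EAX of CPUID leaf 0x80000008 (MSVC intrinsic).
inline std::uint32_t queryCpuidEax() {
    int regs[4] = {0, 0, 0, 0};
    __cpuid(regs, static_cast<int>(0x80000008u));
    return static_cast<std::uint32_t>(regs[0]);
}
#elif (defined(__GNUC__) || defined(__clang__)) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
// Returns EAX of CPUID leaf 0x80000008, or 0 if the leaf is not supported.
inline std::uint32_t queryCpuidEax() {
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(0x80000008u, &eax, &ebx, &ecx, &edx)) {
        return 0;
    }
    return eax;
}
#endif

// Bits 7:0 of EAX hold the physical-address width.
inline unsigned physicalAddressWidth(std::uint32_t eax) {
    return eax & 0xFFu;
}

// Bit 7 of an EPT PDPT or PD entry marks a large page (1 GB or 2 MB).
inline bool isLargePage(std::uint64_t entry) {
    return (entry & (1ull << 7)) != 0;
}

// Address of the next table: bits (width-1):12 of the entry.
inline std::uint64_t nextTableAddress(std::uint64_t entry, unsigned width) {
    assert(width > 12 && width <= 52);
    const std::uint64_t mask = ((1ull << width) - 1) & ~0xFFFull;
    return entry & mask;
}
```

With a width of 36, nextTableAddress keeps bits 35:12 of the entry and drops both the low flag bits and anything above bit 35.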

So once we have the guest physical address and the EPTP, it is just a matter of following the entries one table at a time. The interpretation of each entry is given by the tables in chapter 28, 'VMX Support for Address Translation', of the Intel manual.

In order to verify this page table walk, we need the EPTP and a guest physical address, but I have not found an easy way to obtain the VMCS from the debugger. I will follow up if I find a way to obtain this pointer. For now, everything is still in theory.

Thursday, August 23, 2012

Enable Remote Desktop from PowerShell/WMI

The other day I had to connect to my dev box at work from home, but I realized that I had not enabled remote desktop on it, so I could not connect. However, I knew that I had remote desktop access to another machine in my office, so I connected to that one and looked for a way to enable remote desktop on my dev box through the WMI interface using PowerShell. I looked online, found bits of information here and there, and enabled remote desktop, but somehow I still could not connect to my dev box. When I came to work the next day, I learned that I also had to enable the firewall rule for the remote desktop service. So here is my post on how to enable remote desktop on Windows 8.
First, we need to keep our credential information so that we do not have to enter it every time we connect to the remote machine.
$cred = Get-Credential
Then, let us check if remote desktop is enabled. This will tell us if we need to enable it.
PS C:\temp> Invoke-Command -computername dev-pc -scriptblock {(gwmi -class win32_terminalservicesetting -namespace "root\cimv2\terminalservices").allowtsconnections} -Credential $cred
If we need to enable it, run the following command.
PS C:\temp> Invoke-Command -computername dev-pc -scriptblock {(gwmi -class win32_terminalservicesetting -namespace "root\cimv2\terminalservices").setallowtsconnections(1)} -Credential $cred
Then, run the following one to make sure that authentication is required for remote desktop access.
PS C:\temp> invoke-command -computername dev-pc -scriptblock {(gwmi -class win32_tsgeneralsetting -namespace "root\cimv2\terminalservices").setuserauthenticationrequired(1)} -Credential $cred
The above step basically enables Remote Desktop on my dev-pc but the trouble I had was that I still could not connect to my dev box because of firewall so we need to take care of that as well. To do that, we will need to run one more command.
PS C:\temp> invoke-command -computername dev-pc -scriptblock {netsh advfirewall firewall set rule group="remote desktop" new enable=Yes } -Credential $cred
Once we execute the last command, we are now able to remote desktop to remote computer.
Lastly, let me put two websites that I got the above scripts from to enable remote desktop.

Friday, June 15, 2012

Network Virtualization with SR-IOV in simple terms

It is an exciting time for Hyper-V. There are so many new features in Windows 8 Hyper-V that they are worth mentioning. One interesting new feature is SR-IOV, which stands for "Single-Root I/O Virtualization". This feature offloads to the network card work that the CPU previously had to do. Before, the host OS had to process all the incoming and outgoing network packets, which means that it spent a lot of CPU time just figuring out which VM each packet belongs to. With SR-IOV we can avoid that overhead!

Again, this is another cool feature that hardware brings into the virtualization space. From a high-level point of view, the network card itself exposes virtual functions that act like ports. With a regular network card, we just see one physical port, as in the picture below.
However, an SR-IOV network card implements virtual ports associated with the physical port. When a VM starts up, we can assign one of these virtual ports to it, and from then on the VM can talk directly to the network card. This way we do not have to spend CPU time figuring out which VM a network packet belongs to; we cut out that overhead and connect directly to the network card. Following is a diagram that I borrowed from MSDN.


Before I conclude today's post, let me introduce one more term that is equally important for getting this feature to work properly: IOMMU / Intel VT-d. When a network card interacts with the rest of the system, it uses interrupts and DMA to read from or write to memory. With virtual functions talking directly to VMs, interrupts and memory accesses also have to be handled properly. What does this mean? For instance, when the network card wants to read data from a memory location that belongs to a VM, it accesses that location as if the address were a real physical address. However, that address might not be the right one for the VM, because physical memory is shared among all the VMs. Therefore, the memory address has to be translated, and this is done by the IOMMU. It is very similar to the page table walk that translates a virtual address to a physical address, except that it is done for virtual machines. Likewise, interrupts from the network device have to be mapped to the appropriate CPU by finding out which virtual port the interrupt is associated with and forwarding it there.

I copied the following diagram from wiki page. Hopefully the idea will make sense with the picture below.


In summary, SR-IOV offloads this processing work from the CPU with help from both a network card feature and a CPU feature.
  • Virtual Ports/Virtual Functions from the SR-IOV network card
  • DMA/Interrupt remapping from the CPU

Monday, May 28, 2012

Virtual Machine Extensions - second post

Earlier I described briefly how current virtualization technology works. Today let me try to explain a little bit further on the same topic.

I showed the diagram that I got from the Intel manual, which shows how the VMM/Hypervisor interacts with the guest OS. In summary, the processor uses transitions called VM exit and VM entry to move out of and into the guest OS. To a degree that is very similar to how a system call works. In other words, on a 64-bit machine, when we execute the 'syscall' instruction, it causes a change to kernel mode, and the kernel knows which system service the user-mode application requested because the system service number is passed in the EAX register. In a similar fashion, when a VM exit occurs, it causes a change to VMM/Hypervisor mode, and the VMM/Hypervisor does what is necessary to provide the guest OS with a virtualized environment.

If we look at the disassembly of NtReadFile from ntdll.dll, we can see that it executes 'syscall' as follows.

0:013> u ntdll!ntreadfile
ntdll!NtReadFile:
000007fe`747d2e40 4c8bd1          mov     r10,rcx
000007fe`747d2e43 b804000000      mov     eax,4
000007fe`747d2e48 0f05            syscall

Then the natural question would be: what about virtualization? Do we also generate similar code for VM exit or VM entry? The answer is 'No'. The way it works is somewhat different, and you can find detailed information in chapters 25-27 of the Intel Software Developer's Manual, but let me briefly describe how it works here.

Basically, we tell the processor that the VMM/Hypervisor wants to gain control when the guest OS executes certain instructions. Or we specify that when the guest OS touches certain parts of memory, we want to gain control so that we can provide appropriate information to the guest OS.
For instance, we might want to control time inside the guest OS, and one way the OS may obtain time information is via the TSC. So we want to gain control when the guest OS executes the 'rdtsc' or 'rdtscp' instruction. There is a data structure called the virtual-machine control structure (VMCS) that the processor uses when it executes instructions, and all we need to do is program this data structure. Once we do that, control switches to the VMM/Hypervisor when the given instruction is executed in the guest OS. Similarly, when we gain control, the processor tells us the reason why the VM exit occurred and the guest addresses at the time of the exit, so that we can use that information. This makes it a lot easier to create a virtualized environment compared to binary patching technology.

Next, let me briefly describe another piece of hardware support for virtualization.
That is the support for memory address translation. If you think about it, we cannot let a guest OS control the entire memory, as that could mean that it can view pages belonging to other guest OSes. Hence, the VMM/Hypervisor has to control guest OS memory access. How do we do that? Earlier this was done with a software page table that was hidden from the guest OS. What does this mean? When an application in the guest OS reads from memory, the address it uses is a virtual address, so it has to be translated to a physical address. However, the physical address that the guest ends up with is actually a fake physical address from the VMM/Hypervisor's point of view. Hence, we needed a way to supply the real hardware physical address so that the guest OS accesses the correct memory location. This additional translation was done in software, and the technique is normally called 'shadow paging'. Obviously, keeping all the hidden page tables up to date and performing the translation cost more memory and ran slower than native execution. So hardware vendors such as Intel and AMD came up with hardware support that performs this translation behind the scenes, so that software no longer has to do it. This technology is called extended page tables (EPT) by Intel and nested page tables (NPT) by AMD.
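
As a toy illustration of the two translations involved, the C++ sketch below models each page table as a simple map from page number to page number. This is only a mental model with made-up names (PageMap, translate, guestVirtualToSystemPhysical); real page tables are multi-level structures walked by hardware, and the sketch ignores permissions, large pages and the fact that the guest's own page tables also live at guest physical addresses.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

const std::uint64_t kPageShift = 12;                        // 4 KB pages
const std::uint64_t kOffsetMask = (1ull << kPageShift) - 1;

// Toy page table: page number in, page number out.
typedef std::unordered_map<std::uint64_t, std::uint64_t> PageMap;

// One translation stage: swap the page number, keep the page offset.
// Returns false when the page is not mapped.
bool translate(const PageMap& table, std::uint64_t address, std::uint64_t& out) {
    PageMap::const_iterator it = table.find(address >> kPageShift);
    if (it == table.end()) {
        return false;
    }
    out = (it->second << kPageShift) | (address & kOffsetMask);
    return true;
}

// Stage 1 uses the guest's page table, stage 2 uses the EPT/NPT-style table.
bool guestVirtualToSystemPhysical(const PageMap& guestPageTable,
                                  const PageMap& secondLevelTable,
                                  std::uint64_t guestVirtual,
                                  std::uint64_t& systemPhysical) {
    std::uint64_t guestPhysical = 0;
    if (!translate(guestPageTable, guestVirtual, guestPhysical)) {
        return false;
    }
    return translate(secondLevelTable, guestPhysical, systemPhysical);
}
```

In this picture, shadow paging amounts to the hypervisor maintaining the combined guest-virtual-to-system-physical map in software, while with EPT/NPT the hardware performs the second lookup itself.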

Here is the diagram that I stole from one of Intel's presentation slides: Intel Virtualization Technology Roadmap and VT-d Support in Xen. I think the diagram does a great job of illustrating the point. As you can see, there is now a new EPT base register that points to the EPT page table.


In addition, hardware vendors added Virtual Processor Identifiers (VPIDs) so that we do not have to throw out TLB entries that belong to other virtual processors, as it makes sense to keep this cached mapping data to improve overall performance.

So that's all for now and here is the summary of this post:

  • How to configure processor to gain control back when the guest OS executes certain instructions
  • Hardware support for hidden page table to translate guest physical to machine physical address
  • Keeping the address mapping data by using virtual processor id

I hope that this was useful to those who stop by my page. Thank you.

Wednesday, May 16, 2012

How to debug debugger?

Yes, the title says it all.
If you have ever tried to create your own debugger extension to ease the task of debugging your own program or kernel module, you have probably wanted to know how to debug the extension itself, since it runs inside the debugger.

The answer is pretty simple. You can use a couple of windbg commands to accomplish the task.
First, the traditional '.dbgdbg' command. This spawns a CDB session and creates a remote session, so you can either debug in the CDB session or use windbg and connect to the remote session.
But the CDB session is not so friendly, so I often start a new windbg session to debug my debugger extension. This works but requires a couple of steps.

Then, today I learned a new command: '!debugme -lw'. It simply creates a windbg session attached to your own debugger, so it cuts out the extra CDB and remote steps. Very simple and nicely done!

So next time you need to debug your own debugger extension, try '!debugme -lw'

In this post I assumed that you know how to develop a windbg debugger extension. Please refer to Advanced Windows Debugging for more information on that topic. The book has a nice example in which you can probably just replace some of the code to create your own debugger extension.

Wednesday, April 18, 2012

Short introduction to Virtual Machine Extensions

Things have changed greatly in the virtualization area. A few years ago, we had to virtualize the hardware either by binary patching at runtime or by modifying the guest OS. Not every instruction ran natively, and the goal was to run as many instructions as possible natively on the actual processor without VMM intervention.

Nowadays, there is a better way to do virtualization. Namely, hardware support! Hardware vendors came up with many new CPU features so that we do not have to do all the work in software any more. At the heart of this technology is VMX from Intel and SVM from AMD. Here is a high-level overview of these technologies. The point is that from time to time the VMM needs a way to gain control so that it can feed different information to the guest.
First, system software has to register with the processor that it is interested in gaining control when the guest OS executes such and such an instruction. We do this by configuring the virtual-machine control structure (VMCS).

Once this is set up and the guest OS actually executes an instruction we registered, it causes a transition from VMX non-root operation to VMX root operation so that the VMM can give processed information to the guest. This transition is called a VM exit.

Following is a diagram from Intel Manual with regard to the interactions between VMM and VM guests.


There are some other ways to enter VMM mode but that's essentially how the virtualization with hardware support works in the core. Here I have skipped how the system software discovers VMX support in the processor and how it enables the features. For more information, please refer to Intel/AMD documentation. Hopefully, I will have more to say about those in the next posts.

Thursday, February 23, 2012

Address Translation Walk-through (Virtual to Physical)

Just recently I had a chance to read part of Windows Internals again and chat with a co-worker about memory management. He showed me how to walk the page tables to translate a virtual address to a physical address, so let me share that experience here.

For this exercise, we will use the virtual address of nt!NtCreateFile as an example: we will convert it to a physical address and compare what is there.
In this example, we use 32-bit Windows in PAE mode. We can check whether Windows is in PAE mode in several ways; one easy way is to look at where the PDEs are mapped. If you see PDEs at 0xC0600000, it is a PAE kernel. Please refer to the Free System Page Table Entries blog post for more info.

PAE mode has the following three levels of page tables, so we need to keep this in mind as we translate.
  • Page Directory Pointer Table
  • Page Directory
  • Page Table
In addition, in PAE mode each entry is 8 bytes in size, so we need to multiply the index value by eight. In non-PAE 32-bit mode there are only two levels and each entry is 4 bytes long. With PAE mode on, we can address more memory because each PDE and PTE is 64 bits wide. Internally the 32-bit system represents page frame numbers (PFNs) with 24 bits, so with PAE mode on PFNs can go up to 2^24, and thus it can support up to 64GB (2^(24+12) bytes) of physical memory.
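
The 64GB figure follows from simple arithmetic, which the tiny C++ sketch below spells out. The function name maxPhysicalBytes is just for illustration; the inputs are the number of PFN bits and the 12-bit page offset of a 4KB page.

```cpp
#include <cassert>
#include <cstdint>

// Bytes addressable with pfnBits bits of page frame number and 4 KB pages:
// 2^pfnBits pages times 2^12 bytes per page.
constexpr std::uint64_t maxPhysicalBytes(unsigned pfnBits) {
    return (std::uint64_t{1} << pfnBits) << 12;
}

// 24-bit PFNs (PAE) give 64 GB; 20-bit PFNs (non-PAE) give 4 GB.
static_assert(maxPhysicalBytes(24) == 64ull * 1024 * 1024 * 1024, "24-bit PFN");
static_assert(maxPhysicalBytes(20) == 4ull * 1024 * 1024 * 1024, "20-bit PFN");
```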


0. Locate virtual address of NtCreateFile
kd> x nt!ntcreatefile

8286f2a2          nt!NtCreateFile = no type information
kd> !pte nt!ntcreatefile
                    VA 8286f2a2
PDE at C06020A0            PTE at C0414378
contains 00000000001D0063  contains 000000000286F121
pfn 1d0       ---DA--KWEV  pfn 286f      -G--A--KREV


1. Find the current process
kd> !process -1 0
PROCESS 85b9fbb0  SessionId: 0  Cid: 03d0    Peb: 7ffd5000  ParentCid: 0238
    DirBase: 7ef5b1a0  ObjectTable: 8a585968  HandleCount: 594.
    Image: svchost.exe


2. Check the cr3 value for the location of the PDPT. This normally matches the DirBase from step 1; the two values differ in this capture, and the walk below uses cr3.
kd> r cr3
cr3=7ef5b080


3. Get binary representation of NtCreateFile virtual address
kd> .formats 0x8286f2a2
Evaluate expression:
  Hex:     8286f2a2
  Decimal: -2105085278
  Octal:   20241571242
  Binary:  10000010 10000110 11110010 10100010
  .. skip ...


Let us split the binary value as follows so that we can use the pieces as indexes into the tables (2 + 9 + 9 + 12 bits).
10(index to PDPT) 000010100(index to PD) 001101111(index to PT) 001010100010(offset)


4. Get the PDPT entry - 0y means a binary number, and l1 means only one entry (it is the letter 'l', not the pipe '|')
kd> dd /p @cr3+0y10*8 l1
7ef5b090  1ad8b801


5. Check what's in the entry - in this dword the upper 20 bits are the page frame number and the remaining 12 bits are flags
kd> .formats 1ad8b801
Evaluate expression:
  Hex:     1ad8b801
  Decimal: 450410497
  Octal:   03266134001
  Binary:  00011010 11011000 10111000 00000001
  ... skip ...


6. Get the page frame number of the page directory (not strictly necessary, but this is one way to convert the binary value to hex)
kd> ? 0y00011010110110001011
Evaluate expression: 109963 = 0001ad8b


7. Get the PDE for this particular address
kd> dd /p 0x0001ad8b * 0x1000 + 0y10100 * 8 l1
1ad8b0a0  001d0063


8. Get the PTE
kd> dd /p 0x1d0 * 0x1000 + 0y1101111 * 8 l1
001d0378  0286f121


9. Finally, get the actual physical address; this time we add the page offset directly, without multiplying by 8
kd> up 0x286f000+0y1010100010
0286f2a2 8bff            mov     edi,edi
0286f2a4 55              push    ebp
0286f2a5 8bec            mov     ebp,esp
0286f2a7 51              push    ecx
0286f2a8 33c0            xor     eax,eax
0286f2aa 50              push    eax
0286f2ab 6a20            push    20h
0286f2ad 50              push    eax


10. Let us compare with the disassembly from the virtual address

kd> uf nt!ntcreatefile
nt!NtCreateFile:
8286f2a2 8bff            mov     edi,edi
8286f2a4 55              push    ebp
8286f2a5 8bec            mov     ebp,esp
8286f2a7 51              push    ecx
8286f2a8 33c0            xor     eax,eax
8286f2aa 50              push    eax
8286f2ab 6a20            push    20h
8286f2ad 50              push    eax
8286f2ae 50              push    eax


They have the same content so we were able to translate the address. Cool!
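
For reference, steps 4 to 9 can be replayed in a few lines of code. This is a sketch under several assumptions: read_phys is a made-up stand-in for dd /p that only knows the three entries printed above, only the low 32 bits of each 8-byte entry are looked at (fine here because every address is below 4 GB), and large pages are ignored.

```cpp
#include <cstdint>

// One 32-bit word of "physical memory", keyed by its physical address.
struct PhysWord {
    std::uint32_t address;
    std::uint32_t value;
};

// The three table entries that the session above read with dd /p.
constexpr PhysWord kDump[] = {
    {0x7ef5b090u, 0x1ad8b801u},  // PDPT entry 2
    {0x1ad8b0a0u, 0x001d0063u},  // page directory entry 0x14
    {0x001d0378u, 0x0286f121u},  // page table entry 0x6f
};

// Hypothetical stand-in for dd /p: returns 0 (a not-present entry) for any
// address that is not in the dump.
constexpr std::uint32_t read_phys(std::uint32_t address) {
    for (const PhysWord& word : kDump) {
        if (word.address == address) return word.value;
    }
    return 0;
}

struct Translation {
    bool ok;                 // false if some entry on the way was not present
    std::uint32_t physical;  // valid only when ok is true
};

// Walks PDPT -> page directory -> page table for a 4 KB page.
constexpr Translation translate(std::uint32_t cr3, std::uint32_t va) {
    const std::uint32_t pdpt_index = va >> 30;
    const std::uint32_t pd_index = (va >> 21) & 0x1ffu;
    const std::uint32_t pt_index = (va >> 12) & 0x1ffu;
    const std::uint32_t offset = va & 0xfffu;

    // Each entry is 8 bytes wide; bit 0 is the present flag.
    const std::uint32_t pdpte = read_phys((cr3 & ~0x1fu) + pdpt_index * 8);
    if ((pdpte & 1u) == 0) return {false, 0};
    const std::uint32_t pde = read_phys((pdpte & ~0xfffu) + pd_index * 8);
    if ((pde & 1u) == 0) return {false, 0};
    const std::uint32_t pte = read_phys((pde & ~0xfffu) + pt_index * 8);
    if ((pte & 1u) == 0) return {false, 0};

    // The offset is a byte offset, so it is added as-is.
    return {true, (pte & ~0xfffu) + offset};
}

static_assert(translate(0x7ef5b080u, 0x8286f2a2u).ok, "walk succeeds");
static_assert(translate(0x7ef5b080u, 0x8286f2a2u).physical == 0x0286f2a2u,
              "same physical address as step 9");
```

On a live system the reads would of course come from the debugger, not from a hard-coded table; the table just lets the arithmetic be checked at compile time.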

Monday, January 2, 2012

Book - Inside the C++ Object Model

Over the Christmas holiday break, I read 'Inside the C++ Object Model' so here I would like to briefly comment on the book and summarize it.


This book comes well recommended. I found out about it through an old article by Don Box while I was reading 'Essential COM'. The book contains unusual material that describes the internals of C++. For instance, it spends a lot of time on what exactly occurs when a copy constructor is used and when and how we should use one. The most important lesson I took from the book is not so much the technical details as the attitude: understand the internals of C++ in order to write better programs. The book describes how things are actually built and run when we write C++ code, and shows that if we are not careful it is very easy to create a program that performs poorly. Of course, the book is pretty old, so some of it may no longer apply, but it still gives very good insight into C++. One warning if you want to read it: the book has a number of typos and wrong diagrams, so be prepared. I spent hours trying to figure out one piece of code, only to find that it contained a typo. I looked for an errata site for the book but could not find one on the web.

With that, I would like to illustrate a couple of things from chapter 2 of the book.
First, constructors. The author mentions that programmers new to C++ often have two common misunderstandings:

  1. That a default constructor is synthesized for every class that does not define one.
  2. That the compiler-synthesized default constructor provides explicit default initializers for each data member declared within the class.

The compiler only generates a default constructor under certain circumstances, and it is the programmer's job to explicitly initialize data members if needed. For instance, the compiler generates a default constructor if the class contains a data member whose type has a default constructor, so that it can call that member's default constructor. The compiler also generates a default constructor if the class declares (or inherits) a virtual function, so that it can set up the virtual function table and the pointer to it. The lesson is that we need to take care of initialization ourselves where necessary and not rely on the default constructor. In addition, a great deal happens when we define 'T object;' so keep that in mind.
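
One way to see this from code is the standard's notion of a trivial default constructor, which is a close (though not identical) analogue of the book's "nothing is synthesized" case. The sketch below uses std::is_trivially_default_constructible from <type_traits>; all class names are made up for illustration.

```cpp
#include <type_traits>

// No member, base, or virtual function needs constructing, so the default
// constructor is trivial and "Plain p;" leaves value and next uninitialized.
struct Plain {
    int value;
    Plain* next;
};

// A member type with its own default constructor.
struct Member {
    Member() : ready(true) {}
    bool ready;
};

// The compiler has to call Member::Member(), so it generates a default
// constructor for WithMember. That constructor still does not set value.
struct WithMember {
    Member member;
    int value;
};

// The compiler has to set up the virtual table pointer, so it generates a
// default constructor here as well. value is still left alone.
struct WithVirtual {
    virtual void run() {}
    int value;
};

// Initializing the data is the programmer's job, for example with default
// member initializers.
struct Initialized {
    int value = 0;
    Plain* next = nullptr;
};

static_assert(std::is_trivially_default_constructible<Plain>::value,
              "nothing for the compiler to do");
static_assert(!std::is_trivially_default_constructible<WithMember>::value,
              "must construct the member");
static_assert(!std::is_trivially_default_constructible<WithVirtual>::value,
              "must set up the virtual table pointer");
static_assert(Initialized{}.value == 0, "explicitly initialized");
```

In none of the first three classes does the generated constructor give value a defined value, which is the second misunderstanding in the list above.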

Second, temporary objects. Be aware that the compiler sometimes creates temporary objects behind the scenes. For example, look at the following code.

X bar()
{
    X xx;
    // Process xx...
    return xx;
}

Here, the return value is copy constructed from the local object xx, so the compiler would transform the code into something like the following (pseudo C++).

void
bar (X& __result )
{
    X xx;

    // compiler generated invocation
    // of default constructor
    xx.X::X();

    // Process xx...

    // compiler generated invocation
    // of copy constructor
    __result.X::X( xx );

    return;
}

The author then shows two ways to optimize the code above. The following is the one the compiler can do by itself, the Named Return Value (NRV) optimization.


void
bar (X& __result )
{
    // compiler generated invocation
    // of default constructor
    __result.X::X();

    // Process in __result directly

    return;
}


This looks great, but it can cause problems. For instance, if we have instrumented the code to do something in the copy constructor or in the destructor of the local object, that code no longer runs, because the local object xx and its destructor call have been removed. So we need to be mindful of our code and of what is actually taking place.
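
A small way to observe this is to count constructor and destructor calls. The sketch below is my own illustration, not code from the book: Counts, counts() and run_once() are made-up names, and whether the copy is actually removed depends on the compiler and its options.

```cpp
#include <cassert>

// Counters for the special member functions of X.
struct Counts {
    int default_ctor = 0;
    int copy_ctor = 0;
    int dtor = 0;
};

inline Counts& counts() {
    static Counts c;
    return c;
}

// Instrumented class: every construction and destruction is counted.
struct X {
    X() { ++counts().default_ctor; }
    X(const X&) { ++counts().copy_ctor; }
    ~X() { ++counts().dtor; }
};

// Same shape as bar() above: one named local object returned by value.
X bar() {
    X xx;
    return xx;
}

// Calls bar() once and returns what was counted.
Counts run_once() {
    counts() = Counts{};
    {
        X result = bar();
        (void)result;  // only the side effects matter here
    }
    // Every constructed object must have been destroyed exactly once.
    assert(counts().default_ctor + counts().copy_ctor == counts().dtor);
    return counts();
}
```

When the compiler applies the optimization, run_once() should report no copy constructor call and a single destructor call. With it turned off (GCC has -fno-elide-constructors for this), the copy and the extra destructor call reappear, which is exactly the instrumentation that goes missing.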

The book has plenty of points and illustrations on the internals of C++, so maybe I will summarize more of the important ones on this blog to remind myself in the future. As you can see, if you are interested in this kind of insight, give it a try. After all, the book is not so thick. ;-)