Sunday, 30 August 2009

Assembly refresher

    1. The concept
      1. An assembly is a collection of one or more files containing type definitions and resource files (images, XML files, etc.) used in your program.
      2. It also contains a manifest.
        1. A manifest is a set of metadata tables that contain the names of the files that are part of the assembly.
        2. It also holds metadata about the assembly itself, such as its version, culture, and publisher.
      3. It is the smallest unit of deployment in .NET because the CLR operates on assemblies.
      4. The CLR always loads the file that contains the manifest metadata tables first, and then uses the manifest to get the names of the other files in the assembly.
      5. You can have a multi-file assembly in which one file contains the frequently used types and the others contain types that are less likely to be used. This works well for programs that run over the Internet: they only need to download the parts of the assembly that are used most, and the other parts are downloaded on demand.
        1. It is the CLR's job to load the parts of the assembly whenever they are required, whether they are located locally or at an address on the Internet.
        2. You can use the multi-file assembly feature to
          1. add resource or data files to your assembly, or
          2. combine types implemented in different programming languages into a single assembly.
      6. The problem is that VS doesn't provide a way to create a multi-file assembly; you have to use command-line utilities like csc.exe.
        1. All you do is create multiple .netmodule files using csc.exe /t:module, and combine them using the /addmodule switch.
        2. Let's take two C# files, Apple.cs and Mango.cs, and do the following:
          1. csc /t:module Apple.cs [creates a .netmodule]
          2. csc /out:MyTypes.dll /t:library /addmodule:Apple.netmodule Mango.cs [creates an assembly]
        3. The resulting assembly will have two managed modules and one manifest.
    2. Types of assemblies
      1. There are two types of assemblies in .NET:
        1. Weakly named, and
        2. Strongly named
      2. The fundamental difference between the two is that a strongly named assembly is signed with the publisher's public/private key pair, which uniquely identifies the publisher.
      3. A strongly named assembly can be deployed either privately or publicly. The GAC (Global Assembly Cache) is the repository where the CLR looks for strongly named, publicly deployed assemblies.
      4. Having a central repository means that assemblies need more than just a filename to be distinguished. The following shows the Assembly folder before and after it's exposed using the command: regsvr32 D:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\shfusion.dll -u

    AssemblyFolder AssemblyFolderExposed
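
    The identity fields discussed above (name, version, culture, and the public key token that marks a strong name) can be read back at run time through reflection. The following is a minimal sketch, not part of the original post; the class name AssemblyInspector and its helper methods are made up for illustration, and it inspects whichever core library the runtime happens to load.

```csharp
using System;
using System.Reflection;

// Hypothetical helper for illustration; not part of the .NET Framework.
public static class AssemblyInspector
{
    // An assembly whose identity carries a public key token is strongly named.
    public static bool IsStronglyNamed(Assembly assembly)
    {
        byte[] token = assembly.GetName().GetPublicKeyToken();
        return token != null && token.Length > 0;
    }

    // Formats the identity information that the manifest records.
    public static string Describe(Assembly assembly)
    {
        AssemblyName name = assembly.GetName();
        string culture = name.CultureInfo == null || name.CultureInfo.Name.Length == 0
            ? "neutral"
            : name.CultureInfo.Name;
        return string.Format("{0}, Version={1}, Culture={2}, StronglyNamed={3}",
            name.Name, name.Version, culture, IsStronglyNamed(assembly));
    }

    public static void Main()
    {
        // The assembly that defines System.Object is the runtime's core library.
        Assembly core = typeof(object).Assembly;
        Console.WriteLine(Describe(core));

        // List the managed modules that make up the assembly.
        foreach (Module module in core.GetModules())
        {
            Console.WriteLine("  module: " + module.Name);
        }
    }
}
```

    Running the same helper against the MyTypes.dll built above should list more than one module, since Apple.netmodule was added with /addmodule.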

Iterative and Evolutionary development - Heads up

  • In this lifecycle approach, development is organized into a series of short, fixed-length mini-projects called iterations.
  • The result of each iteration is a subset of the actual production-level software. It's neither a prototype nor an experimental throwaway.
  • The software grows with each iteration until it converges on the expected end result, which is why the approach is also called Iterative and Incremental Development.
  • It's called Iterative and Evolutionary because feedback and adaptation evolve the specification and design.
  • Each iteration follows a model of requirements analysis, followed by design, followed by implementation and test, and finally a complete system integration test.
  • How are changes handled in such an iterative approach?
    • The only thing that's permanent is change. It's inevitable, so there is no point in fighting it; embrace change instead.
    • Having said that, iterative development doesn't encourage or invite sudden changes; it tries to ease the process of change.
    • The early iterations help in getting quick feedback from the business user, developer, and tester. This minimizes speculation, helps uncover the actual requirements, and validates the path of development.
    • It's easy to see that work proceeds through a series of build-feedback-adapt cycles. The early iterations will no doubt deviate somewhat from the actual requirements or the true path, but later iterations will align with it.
  • There are multiple benefits of the Iterative Development:
    • Projects are less likely to fail.
    • Early rather than late mitigation of risks (technical, requirements, objectives, usability, etc.)
    • Visible progress
    • Early feedback, user engagement, and adaptation lead to a refined and acceptable system
    • Complexity is managed: the team is saved from analysis paralysis and from very long and complex steps, because the portions on their plate are small enough to finish comfortably.
  • The length of an iteration is equally important - it doesn't make sense to have a 6-week iteration in a 2 month long project.
    • Short is good
    • Small steps, rapid feedback and adaptation are at the core of Iterative development.
    • A key idea is that whatever the best length turns out to be, the iteration has to be timeboxed, that is, fixed. The partial system must be integrated, tested, and stabilized by the scheduled date, with no exceptions.

Analysis and Design - Heads up

  • Analysis emphasizes investigation of the problem and its requirements. It's a broad term, so we will elaborate on the key kinds:
    • Requirements analysis
      • Investigates requirements
    • Object oriented analysis
      • Investigates domain objects
  • During the analysis of the problem and its requirements, when a project is started, the points to discuss are:
    • How is it going to be used?
    • What problems does it solve?
    • What are its functions?
  • During object-oriented analysis the focus is on finding and describing the objects or concepts in the problem domain, e.g. Department, Designation, and Role in a Payroll system.
  • Design emphasizes a conceptual solution that fulfills the requirements, without being concerned about implementation yet.
    • For example, a description of a database schema and domain objects.
  • During object-oriented design the focus is on defining the objects or concepts and figuring out how they collaborate with each other to fulfill a requirement, e.g. a Department may have many Designations, and a Designation may have different Roles.
  • Object oriented programming follows analysis and design.
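
The Payroll example above can be carried one step further, from design into code. This is only a minimal sketch under assumptions: the property names, the CountRoles method, and the sample data are invented for illustration and are not taken from any real payroll system.

```csharp
using System;
using System.Collections.Generic;

// Domain objects found during analysis: Department, Designation, Role.
public sealed class Role
{
    public string Name { get; private set; }
    public Role(string name) { Name = name; }
}

public sealed class Designation
{
    private readonly List<Role> roles = new List<Role>();
    public string Title { get; private set; }
    public Designation(string title) { Title = title; }

    // Design decision: a designation may have different roles.
    public IList<Role> Roles { get { return roles; } }
}

public sealed class Department
{
    private readonly List<Designation> designations = new List<Designation>();
    public string Name { get; private set; }
    public Department(string name) { Name = name; }

    // Design decision: a department may have many designations.
    public IList<Designation> Designations { get { return designations; } }

    // Collaboration: the department asks each designation for its roles.
    public int CountRoles()
    {
        int total = 0;
        foreach (Designation designation in designations)
        {
            total += designation.Roles.Count;
        }
        return total;
    }
}

public static class PayrollDemo
{
    public static void Main()
    {
        var engineering = new Department("Engineering");

        var developer = new Designation("Developer");
        developer.Roles.Add(new Role("Backend"));
        developer.Roles.Add(new Role("Code reviewer"));

        var lead = new Designation("Team Lead");
        lead.Roles.Add(new Role("Mentor"));

        engineering.Designations.Add(developer);
        engineering.Designations.Add(lead);

        Console.WriteLine("{0}: {1} designations, {2} roles",
            engineering.Name, engineering.Designations.Count, engineering.CountRoles());
    }
}
```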

What seasoned Architects say about the craft

  • I cannot stress this enough: Keep working to clarify your understanding of the desired solution. If you come from a development background, your disposition is to start working on a solution from the moment that the problem is stated. The effect that this has is that the rest of the conversation sometimes gets tuned out while your mind turns to solving the problem that you have heard. It is important to resist this urge and pay attention, so that you can understand what the person is saying. Ask questions, and challenge the things that you do not understand.
  • In India, the general answer was always immediately, "Yes, it can be done," and then the developers would go off to huddle and try to figure out what they thought was being asked for in the solution. In Eastern Europe, every request was responded to with a barrage of questions, and unclear ideas were challenged until everyone had a clear understanding of the desired solution.
  • Communications skills are the skills that I work on the most, because they are the most critical for an architect. Without top communications skills, it's going to be difficult to get your vision into the heads of others. The highest responsibility of an architect is to communicate the solution to the business leaders, technical leaders, and any other interested parties in a language that they understand.
  • Architects must have a mastery of three languages: business language, for communicating with the business people; industry language, for communicating in the vernacular of the vertical; and technical language, for communicating to the technical leadership and the developers.
  • You must become familiar with the domain in which you will be developing a solution. I believe that there is a tremendous benefit to having experience in multiple industries and domains. This allows you to be like a bumble-bee and cross-pollinate the best ideas across industries. I have found also that the DNA of most architects contains a hunger for knowledge in a broad range of subject areas. Architects are interested in knowing about things and understanding how things work. Architects can then synthesize this knowledge into creating solutions.
  • As architects, our minds can be far into the future (as they should be), but we also should not assume that the rest of the team can see that far. We must communicate, educate, and mentor the team to our level of understanding, so that they can understand the full vision. Be like a U.S. Marine: Leave no one behind.
  • I wear a Microsoft Xbox 360 wristband that reads, "Challenge Me"—a statement that I think gets to the heart of what drives a solutions architect. IT solutions architecture is a constantly evolving field, with an ever-increasing set of new challenges. Albert Einstein once said, "You cannot solve tomorrow's problems with today's level of thinking." We must always be striving to take it to the next level. This has been what has driven me for the past 25 years in IT, and will continue to drive me into the future.
  • The DNA of an architect is to strive to understand the problem, envision the solution, and then communicate the vision to the folks who will implement the solution. Educate and mentor your team, so that everyone comes away with the same vision.


    Critical-Thinking Questions

    • Are you listening to the customer? I mean, really listening to the customer? Are you asking questions to clarify the problem? Are you willing to stay in the envisioning phase until everything is clear?
    • What are you going to do today, to sharpen your communication skills?
    • Are you stepping back to see the big picture? How far into the future do you see this vision? Can you communicate this vision?
    • Are you taking the time to bring the team along? Are you making sure that your team understands the vision?


    Source:

    http://msdn.microsoft.com/en-us/library/bb447671.aspx

Dissecting ‘Hello World’

Look at the following age-old Hello World program in C#.

clip_image001
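
The listing itself was an image and is not reproduced here. A minimal sketch consistent with the description that follows (a type called Program with a single public static Main method that uses System.Console) might look like this; the exact greeting text is an assumption.

```csharp
public sealed class Program
{
    public static void Main()
    {
        // System.Console is implemented in MsCorLib.dll.
        System.Console.WriteLine("Hello World");
    }
}
```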

The above piece of code can be explained as having

  1. Defined a type called Program having a single public static method called Main
  2. Main refers to a type called System.Console which is written by Microsoft and the IL code that implements it is in a file called MsCorLib.dll
  3. To compile the program you could execute the following command:
    1. csc.exe /out:Program.exe /t[arget]:exe /r[eference]:MSCorLib.dll Program.cs
  4. But the following command also works:
    1. csc.exe Program.cs
  5. The reason is that the compiler has some defaults, and the reference to MsCorLib.dll is one of them. Note that MsCorLib.dll is the most critical library in the .NET Base Class Library (BCL); it contains all the core types like String, Int32, etc. The exe target is another default.
  6. However, if you want to break it, try the following:
    1. csc.exe /nostdlib Program.cs
    2. /nostdlib asks the compiler not to reference MsCorLib.dll. If this switch breaks even a simple program, why is it useful, you might ask.
    3. The reason is simple: when the compiler compiles MsCorLib.dll itself, it doesn't need a reference to MsCorLib.dll, does it? That is when /nostdlib is used.

As a result of the above command, a simple executable file called Program.exe is created. Let's dive a little deeper into it.

  1. It’s a PE (Portable Executable) having the following:
    1. PE header (PE32 or PE32+)
    2. CLR header
    3. Metadata
      1. To examine the Metadata within a managed PE file use ILDasm.exe
    4. IL
  2. It's also an Assembly.

See you soon!

Saturday, 29 August 2009

A quick note on IL

  • IL is a CPU-independent machine language created by Microsoft.
  • It is a much higher-level language than most CPU machine languages. It's part of the managed module that the compiler creates from source code.
  • It can access and manipulate object types and has instructions to create and initialize objects, call virtual methods on objects, and manipulate array elements directly. It even has instructions to throw and catch exceptions for error handling.
  • IL can also be written by hand in IL assembly language.
  • Bear in mind that any high-level language will most likely expose only a subset of the facilities provided by the CLR, whereas IL assembly language exposes all of them. So, if you want to use a CLR facility that your preferred language doesn't expose, you have two options:
    • Use IL to code the desired part
    • Use any other CLR language that exposes the missing facility.
  • This is an amazing feature provided by the CLR where you can use specialized language to perform a specific task in your project. Often overlooked, but wouldn't it be great to perform normal operations like I/O in C#/VB.Net and leverage APL for engineering calculations?
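
To get a feel for individual IL instructions without leaving C#, one option is to emit them at run time with System.Reflection.Emit. Note that this is a different route from writing IL assembly text by hand; it is a minimal sketch, and the class name IlAdder is made up for illustration.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical example class; builds a tiny method directly from IL opcodes.
public static class IlAdder
{
    public static Func<int, int, int> Build()
    {
        // A dynamic method with the signature: int Add(int, int)
        var method = new DynamicMethod("Add", typeof(int), new[] { typeof(int), typeof(int) });
        ILGenerator il = method.GetILGenerator();

        il.Emit(OpCodes.Ldarg_0); // push the first argument onto the evaluation stack
        il.Emit(OpCodes.Ldarg_1); // push the second argument
        il.Emit(OpCodes.Add);     // pop both values, push their sum
        il.Emit(OpCodes.Ret);     // return the value on top of the stack

        return (Func<int, int, int>)method.CreateDelegate(typeof(Func<int, int, int>));
    }

    public static void Main()
    {
        Func<int, int, int> add = Build();
        Console.WriteLine(add(2, 3)); // prints 5
    }
}
```

The four opcodes are CPU-independent; the CLR compiles them to native instructions the first time the delegate is invoked.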

Understanding Metadata

  • In addition to emitting IL, every compiler targeting the CLR is required to emit full metadata into every managed module.
  • The metadata is a set of data tables that describe what is defined in the module, such as types and their members.
  • It also has tables indicating what the managed module references, such as imported types and their members.
  • Metadata is a superset of older technologies such as Type Libraries and Interface Definition Language (IDL) files. It's embedded in the same exe/dll as the IL, which means that it's impossible to separate IL and metadata.
  • It has many other benefits:
    • Header and library files are no longer needed, because the metadata contains all the type information. Compilers can read metadata directly from managed modules.
    • It allows IntelliSense in Visual Studio to work.
    • The CLR's code verification process uses metadata to ensure that the code is "safe".
    • It helps in recreating an object's state across processes or machines, because the object's fields can be serialized, sent across, and deserialized in memory.
    • It allows the garbage collector to track the lifetime of objects, because of the type and object information it contains.
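
Both kinds of metadata tables, those describing what a module defines and those describing what it references, are reachable through reflection. The following is a minimal sketch; MetadataBrowser and DeclaredMethodNames are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper for illustration; reads metadata through reflection.
public static class MetadataBrowser
{
    // Distinct names of the public methods declared directly on the given type.
    public static List<string> DeclaredMethodNames(Type type)
    {
        var names = new List<string>();
        BindingFlags flags = BindingFlags.Public | BindingFlags.Instance
            | BindingFlags.Static | BindingFlags.DeclaredOnly;

        foreach (MethodInfo method in type.GetMethods(flags))
        {
            if (!names.Contains(method.Name))
            {
                names.Add(method.Name);
            }
        }
        return names;
    }

    public static void Main()
    {
        // Definitions: members that this type's own metadata describes.
        foreach (string name in DeclaredMethodNames(typeof(MetadataBrowser)))
        {
            Console.WriteLine("defines: " + name);
        }

        // References: assemblies that this assembly's metadata points to.
        foreach (AssemblyName reference in typeof(MetadataBrowser).Assembly.GetReferencedAssemblies())
        {
            Console.WriteLine("references: " + reference.Name);
        }
    }
}
```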

Understanding Managed Modules

  • The common language runtime (CLR) is just what its name says it is; a runtime that is usable by different and varied programming languages. The features of the CLR are available to any and all programming languages that target it—period.
  • In fact, at runtime, the CLR has no idea which programming language the developer used for the source code. This means that you should choose whatever programming language allows you to express your intentions most easily.
  • When you compile source code written in your preferred language (C# being a common choice), the compiler checks the syntax and analyzes the source code. It then produces a managed module.
  • A managed module is a standard 32-bit Microsoft Windows portable executable (PE32) file or a standard 64-bit Windows portable executable (PE32+) file that requires the CLR to execute.
  • Microsoft ships two command-line utilities, DumpBin.exe and CorFlags.exe, that you can use to examine the header information emitted in a managed module by the compiler.
  • The following are the parts of a managed module:
    • PE32 or PE32+ header
      • If the header uses the PE32 format, the file can run on a 32-bit or 64-bit version of Windows. If the header uses the PE32+ format, the file requires a 64-bit version of Windows to run.
      • For modules that contain only IL code, the bulk of the information in the PE32(+) header is ignored. For modules that contain native CPU code, this header contains information about the native CPU code.
    • CLR Header
      • Contains the information (interpreted by the CLR and utilities) that makes this a managed module.
      • The header includes the version of the CLR required, some flags, managed module's entry point method (Main method), and the location/size of the module's metadata, resources, strong name etc.
    • Metadata
      • Every managed module contains metadata tables.
      • There are two main types of tables:
        • tables that describe the types and members defined in your source code and
        • tables that describe the types and members referenced by your source code.
    • Intermediate Language (IL) code
      • It's the code the compiler produces as it compiles the source code. At run time, the CLR compiles the IL into native CPU instructions.
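
Whether a module carries a PE32 or a PE32+ header can be checked by reading a few fields at the start of the file, which is roughly what DumpBin.exe reports. This is a minimal sketch based on the published PE file layout; PeHeaderReader is a hypothetical name, and the code does not parse the CLR header or metadata.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; reads only the PE format marker.
public static class PeHeaderReader
{
    // Returns "PE32" or "PE32+" for a PE image, or throws if the data is not one.
    public static string ReadFormat(Stream stream)
    {
        var reader = new BinaryReader(stream); // BinaryReader reads little-endian values

        // DOS header starts with the bytes 'M', 'Z'.
        if (reader.ReadUInt16() != 0x5A4D)
            throw new InvalidDataException("Missing MZ signature.");

        // Offset 0x3C holds the file position of the PE signature.
        stream.Seek(0x3C, SeekOrigin.Begin);
        int peOffset = reader.ReadInt32();

        // PE signature is the bytes 'P', 'E', 0, 0.
        stream.Seek(peOffset, SeekOrigin.Begin);
        if (reader.ReadUInt32() != 0x00004550)
            throw new InvalidDataException("Missing PE signature.");

        // Skip the 20-byte COFF file header to reach the optional header.
        stream.Seek(20, SeekOrigin.Current);
        ushort magic = reader.ReadUInt16();

        switch (magic)
        {
            case 0x10B: return "PE32";
            case 0x20B: return "PE32+";
            default: throw new InvalidDataException("Unknown optional header magic.");
        }
    }

    public static void Main(string[] args)
    {
        // Inspect the file given on the command line, or this program's own assembly.
        string path = args.Length > 0 ? args[0] : typeof(PeHeaderReader).Assembly.Location;
        using (FileStream file = File.OpenRead(path))
        {
            Console.WriteLine(path + ": " + ReadFormat(file));
        }
    }
}
```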

Zachman Framework

  • One of the popular frameworks originated at IBM and is called the Zachman Framework. The Zachman Framework predated the popularity of object orientation and took the perspective of separating data from process.
  • The Zachman Framework formed a matrix of architectural descriptions that are also organized in terms of levels.
    • There are five levels of description above the information system implementation.
    • They range from architectural planning done by individual programmers at the finest grain to the overall enterprise requirements from the investors' perspective of the information system.
    • In total, the Zachman Framework identifies thirty architectural specifications, which provide a complete description of the information system. In practice, no real-world project is capable of creating these thirty or more detailed plans and keeping them all in synchronization.
    • When the Zachman Framework is applied, systems architects partition the viewpoint into various categories and create architectural specifications that cover some or all of the different Zachman descriptions without having to create the large number of individual specification documents that the Zachman Framework spans.

The need for Software Architecture

To accommodate system complexities in the world of distributed processing, Architects have three new needs:

  1. First, architects need the ability to separate complex concerns, in particular to separate concerns about business-application functionality, from concerns about distributed-system complexity.
    1. Problems and challenges of distributed computing have nothing to do fundamentally with business-application functionality.
    2. In a typical application, 70% of application code is infrastructure. Some of this code is unique to the application even though it might not directly address business requirements. By separating concerns, developers can focus on the business functionality that is the true purpose of the information system.
  2. Software architects also need the ability to future-proof the information systems that they are planning.
    1. It is important to accommodate commercial technology evolution, which is known to be accelerating and provides substantial challenges for architects and developers.
    2. Future-proofing also requires the ability to adapt to new user requirements, since requirements do change frequently and account for a majority of system software costs over the life cycle.
    3. It is important to plan information systems to support the likely and inevitable changes that users will require in order to conduct business.
  3. A third need for software architects is the ability to increase the likelihood of system success.
    1. Corporate developers to date have had a very poor track record of creating successful systems.
    2. The software architect is responsible for planning systems with the maximum probability of delivering success and key benefits for the business.
    3. Through proper information technology planning, it is possible to increase the likelihood of system delivery on time and on budget.


Open Distributed Processing

  • Among the various architectural approaches, a useful international standard, the Reference Model for Open Distributed Processing (RM-ODP), defines what information systems architecture means [ISO 1996].
  • RM-ODP defines five essential viewpoints for modeling systems architecture:
    • Enterprise viewpoint
    • Information viewpoint
    • Computational viewpoint
    • Engineering viewpoint
    • Technology viewpoint
  • The five viewpoints provide a comprehensive model of a single information system, with each viewpoint being a perspective on a single information system. The set of viewpoints is not closed, so that additional viewpoints can be added as the needs arise.
    • Another of their purposes is to provide information descriptions that address the questions and needs of particular stakeholders in the system.
    • By standardizing five viewpoints, RM-ODP is claiming that these five stakeholder perspectives are sufficient for resolving both business functionality and distributed systems issues in the architecture and design of information systems.
  • The enterprise viewpoint of the RM-ODP takes the perspective of a business model.
    • Managers and end users in the business environment should be able to understand the enterprise models readily.
    • The enterprise viewpoint ensures that business needs are satisfied through the architecture and provides a description that enables validation of these assertions with the end users.
  • The information viewpoint defines the universe of discourse in the information system.
    • The perspective is similar to the design information generated by a database modeler.
    • The information viewpoint is a logical representation of the data and processes on data in the information system.
    • The information viewpoint is an object-oriented logical model of the information assets in the business and how these assets are processed and manipulated.
  • The computational viewpoint partitions the system into software components that are capable of supporting distribution.
    • It takes the perspective of a designer of application program interfaces for component-ware.
    • The computational viewpoint defines the boundaries between the software elements in the information system.
    • Generally, these boundaries are the architectural controls that ensure that the system structure will embody the qualities of adaptability in management of complexity that are appropriate to meet changing business needs and incorporate the evolving commercial technology.
  • The engineering viewpoint of RM-ODP exposes the distributed nature of the system.
    • Its perspective is similar to that of an operating system engineer who is familiar with the protocol stacks and allocation issues necessary to define the distributed processing solutions for the information system.
  • The technology viewpoint defines the mappings between the engineering objects and other objects designed to specific standards and technologies including product selections.
    • The viewpoint is similar to that of a network engineer who is familiar with the commercially available protocol standards and products that are appropriate selections to configure the information system.


    The RM-ODP viewpoints enable the separation of concerns that divide the business and logical functionality of the system from the distributed computing and commercial technology decisions of the architecture.