A while back, I worked on a project that required persisting data to multiple databases. The requirement was to save some data to a Microsoft SQL Server (which hosted the billing application data) and save a different set of data to an Oracle database (which housed the database for the flagship Online Transaction Processing [OLTP] system). Both of the database servers were protected by the almighty firewall, valiantly protecting the valuable database servers from a variety of virulent violations.
Ailing attempts at alliteration aside, here’s a high-level diagram describing the application I’m talking about:
While we were testing the application in the Integration Testing environment, we quickly noticed that the persistence operations were failing. Read operations were working fine, but that’s because we were reading our data from yet another database (the account/CRM database, omitted from the diagram above for simplicity). The Windows System Administration group reported seeing errors in the Application Event Log on the Application Server. Those errors pointed to the Microsoft Distributed Transaction Coordinator (MS DTC or just plain ol’ DTC).
But everything worked so well in the Development environment! An environment where there’s no firewa — oh. Crap. Crap on a crap cracker.
Looking over the Windows System Admin’s shoulder, we saw that the Distributed Transaction Coordinator was being enlisted during a save operation. Watching the DTC statistics, we noticed that the transactions were aborting. Okay, that’s a good start, which led to this conversation:
SA: Why are the DTC transactions aborting – are they being blocked by the firewall?
DEV: That seems pretty likely, especially considering we didn’t see this happen in Development.
SA: What port should we open?
DEV: Well, the DTC uses RPC to communicate. According to Microsoft, RPC uses port 135. Can you telnet to the database server from the application server over port 135?
SA: Wait a sec… nope doesn’t look like it. I’ll get that port opened and we can try again.
[Windows SAs work with the network admins to get port 135 open on the almighty firewall. Testing resumes. Abysmally.]
SA: Port 135 has been opened, but the DTC is reporting that transactions are still being aborted. What now?
DEV: What now? Umm… to the batcave! I need to do some homework to figure out what’s going on.
SA: Wait, one more question – why is the MS DTC involved anyway?
DEV: Well, our application has to do a 2-phase commit. We have to save data to SQL Server and Oracle. If a save to one database fails, we want to roll everything back. We’re using the System.Transactions namespace for our transaction management. Since we’re opening multiple database connections, System.Transactions is escalating our operation to the DTC, which handles the whole thing. It’s actually pretty –
SA: Okay, okay. I get it. Stop talking and go fix it… nerd.
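For anyone who wants the part the SA cut off: here is a toy sketch of the two-phase commit idea, written in Python. It is not how MS DTC or System.Transactions is implemented; the `Participant` class and `two_phase_commit` function are hypothetical names invented purely to show the "everybody votes, then everybody commits or everybody rolls back" flow.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical stand-in for one database enlisted in the transaction."""
    name: str
    can_commit: bool = True      # set to False to simulate a failed save
    state: str = "active"
    log: list = field(default_factory=list)

    def prepare(self) -> bool:
        # Phase 1: vote on whether the pending work can be made durable.
        self.state = "prepared" if self.can_commit else "failed"
        self.log.append("prepare")
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"
        self.log.append("commit")

    def rollback(self) -> None:
        self.state = "rolled_back"
        self.log.append("rollback")


def two_phase_commit(participants) -> bool:
    """Commit everywhere only if every participant votes yes."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: apply the unanimous decision to every participant.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


sql_server = Participant("billing (SQL Server)")
oracle = Participant("OLTP (Oracle)", can_commit=False)
print(two_phase_commit([sql_server, oracle]))  # False
print(sql_server.state, oracle.state)          # rolled_back rolled_back
```

The coordinator has to talk to every participant in both phases, which is exactly why a firewall sitting between the servers can sink the whole transaction.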
What’s Going On
Whenever the DTC steps in and starts managing transactions, it does a whole lot of work for you. Here are a couple of the highlights (the highlights we care about, anyway):
- Uses RPC, whose endpoint mapper listens on port 135.
- As part of the normal MSRPC protocol, MSDTC then chooses a dynamic port between 1024 and 65535 for the rest of the conversation.
  - Aha, lightbulb moment! This is why our transactions are still being aborted: opening port 135 only gets the conversation started. We’ll see how to address this in the How can I Limit the Port Range on My Servers? section.
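To make the lightbulb moment concrete, here is a small Python model of the situation. It is only an illustration: `firewall_allows` and `dtc_handshake` are made-up helpers, the firewall is just a set of open ports, and the 5000-5020 range is an example I picked, not a recommendation.

```python
import random

RPC_ENDPOINT_MAPPER_PORT = 135


def firewall_allows(port: int, open_ports: set) -> bool:
    """Model the firewall as a simple allow-list of ports."""
    return port in open_ports


def dtc_handshake(open_ports: set, dynamic_range: range, rng: random.Random) -> bool:
    """Return True if both legs of the modeled conversation get through."""
    # Leg 1: the initial RPC contact on port 135.
    if not firewall_allows(RPC_ENDPOINT_MAPPER_PORT, open_ports):
        return False
    # Leg 2: the conversation moves to a dynamically chosen port.
    dynamic_port = rng.choice(dynamic_range)
    return firewall_allows(dynamic_port, open_ports)


rng = random.Random(0)

# Only port 135 is open and the dynamic range is unrestricted: always blocked.
print(dtc_handshake({135}, range(1024, 65536), rng))                 # False

# Dynamic range restricted and the same range opened on the firewall.
restricted = range(5000, 5021)
print(dtc_handshake({135, *restricted}, restricted, rng))            # True
```

The first call can never succeed because no port in 1024-65535 is in the allow-list, which matches what we saw after port 135 was opened.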
Before attempting any of this, have a conversation with your SA team and network admin team. Make sure they’re okay with everything being suggested below. The solution to this issue isn’t code related – it’s all server configuration/management. In a way, you – the developer – are acting as the DTC on your application’s behalf, making sure that the SA and network admins are coordinating and committing their changes together.
Here’s the short version of what to look for:
- Make sure DTC is configured properly (i.e. can accept connections).
- Make sure your application and database servers can communicate through the firewall over port 135.
- Limit the port range DTC can use when communicating via RPC.
- Open those ports (the range chosen in the previous step) bi-directionally on the firewall.
- Use DTCPing to verify/troubleshoot.
Is MS DTC Configured Properly?
WARNING: I know I said it before, but it really bears repeating: Work with your SA team before making any of these changes.
Here are a couple of things you can look for on your application and database servers:
- Check the MS DTC settings (you can get to this from the control panel) on the application server AND the database server(s).
  - The following MS DTC settings should be turned on/checked:
    - Allow network access
    - Allow remote administration (not required, but advisable for testing/debugging)
      - This setting should be shut off in Production. Safety first, kids.
    - Allow inbound connections
    - Allow outbound connections
- Make sure you can telnet from your application server to the database server (and vice versa).
  - Use the following command: telnet [the name of your server] 135
  - If you see a blank screen with a blinking cursor, then telnet worked. Port 135 is open, and your servers can communicate over it.
  - NOTE: On Windows Server 2008, telnet is not installed by default. You may have to install this on the server. You can follow this guide to install a telnet client.
- Install and run DTCPing on the application and database servers.
  - Consult this article on how to get DTCPing up and running.
  - If memory serves, there’s a warning about DTCPing not being supported on/by Windows Server 2008.
    - While troubleshooting this issue, we used DTCPing on at least one Windows 2008 Server and it appeared to work properly. Your mileage may vary.
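If telnet isn't installed and you'd rather not add it, a few lines of Python can stand in for the port 135 check. This is a minimal sketch, assuming Python is available on the server; `can_connect` is a helper name I made up. It only proves a TCP connection can be opened, so it is no substitute for DTCPing.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host name and attempts the connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling `can_connect("[the name of your server]", 135)` from the application server, and then again from the database server in the other direction, mirrors the telnet test above: `True` means the port is reachable, `False` means something (quite possibly the firewall) is in the way.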
Monitoring MS DTC Statistics
To monitor activity, you can watch the Statistics window and have someone conduct a test from the affected application.
- If you see that the transaction completed, then the MS DTC can communicate between the application and database servers through the firewall and coordinate the transaction.
- If you see that the transaction aborted, then there is some sort of communication issue between the servers (e.g. a firewall is blocking the port).
How can I Limit the Port Range on My Servers?
So, the MS DTC is configured properly and port 135 is open on your firewall. The DTC statistics are still reporting that the transactions are aborting. That’s likely because the DTC is picking a port between 1024 and 65535, which is blocked by the firewall. With such a huge range of available ports, how will the almighty firewall be able to do its job?
Follow the steps below to limit the port range the DTC has to choose from.
WARNING: The steps below involve changes to the server’s registry and restarting the entire machine, so make sure these steps are conducted during your maintenance window.
- Work with your SA team and network admin team to identify an acceptable port range for RPC to use.
  - Microsoft recommends:
    - Opening ports from 5000 and up
    - Opening a minimum of 15 to 20 ports
- Configure the DCOM Port Range restriction on the application server:
  - Update the registry (HKEY_LOCAL_MACHINE\Software\Microsoft\Rpc) to use the ports identified.
  - Refer to this article for details on how to specify the port range.
- Configure the DCOM Port Range restriction on the database server:
  - Update the registry (HKEY_LOCAL_MACHINE\Software\Microsoft\Rpc) to use the ports identified.
  - Refer to this article for details on how to specify the port range.
- Restart the application server.
- Restart the database server(s).
- Open up the same range of ports (i.e. the range identified in the first step) on the firewall bi-directionally.
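As a rough aid for the conversation with your SA team, the sketch below sanity-checks a candidate range against the recommendations above and prints the `reg add` commands that would set it. Treat the details as assumptions to verify against the referenced article: the `Internet` subkey and the `Ports`, `PortsInternetAvailable`, and `UseInternetPorts` value names reflect my understanding of Microsoft's RPC port restriction guidance, and the function names are made up. The code only builds strings; it never touches the registry.

```python
def validate_port_range(first: int, last: int, min_ports: int = 15) -> None:
    """Raise ValueError if the range ignores the recommendations above."""
    if first < 5000:
        raise ValueError("range should start at port 5000 or higher")
    if last > 65535:
        raise ValueError("port numbers cannot exceed 65535")
    if last - first + 1 < min_ports:
        raise ValueError(f"range should contain at least {min_ports} ports")


def rpc_port_commands(first: int, last: int) -> list:
    """Build (but do not run) the commands for an SA to review."""
    validate_port_range(first, last)
    # Assumed location: an "Internet" subkey under the Rpc key named above.
    key = r"HKLM\Software\Microsoft\Rpc\Internet"
    return [
        f'reg add "{key}" /v Ports /t REG_MULTI_SZ /d {first}-{last} /f',
        f'reg add "{key}" /v PortsInternetAvailable /t REG_SZ /d Y /f',
        f'reg add "{key}" /v UseInternetPorts /t REG_SZ /d Y /f',
    ]


# 5000-5020 is an example range only; use whatever your teams agree on.
for command in rpc_port_commands(5000, 5020):
    print(command)
```

Printing the commands instead of running them keeps the actual change in the hands of the SA team and inside the maintenance window.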
After our SA team and network admin team made the changes cited in the How can I Limit the Port Range on My Servers? section above, our application started working! We saw our transactions were committed in the DTC statistics window. We also saw that our records were persisted in their respective databases.
For being commonly described as a “black box that you shouldn’t have to worry about”, there sure are a lot of configuration options for the MS DTC. Hopefully this article has helped de-mystify the Microsoft Distributed Transaction Coordinator. Furthermore, I hope this blog entry will serve as a basic guide for troubleshooting those pesky DTC communication issues.
I found these articles invaluable to solving our DTC/firewall issue:
- Troubleshooting MS DTC Issues With DTCPing
- Configuring DTC to Work Through a Firewall
- DTC Statistics Window Contents
- How to Troubleshoot MS DTC Firewall Issues
- Microsoft Distributed Transaction Coordinator