Odd Transport Issue: Mail Stuck in internal queues


A week or so ago I started to notice messages getting stuck in the queues on one of my Exchange 2013 servers. My troubleshooting included restarting services, applying the latest CU, and looking through logs, but I was unable to find the issue. What I did see in the logs, shown at the end of this post, were connection-refused and DNS errors that looked related to IPv6. To see if IPv6 was really the issue I set up static IPv6 addresses on the Exchange servers, but that didn’t help. After exhausting many other things and getting tired of copying the mail.que files from one server to another to get the messages delivered, I finally called PSS (Microsoft Support).

PSS started out by checking basic name resolution, which was working. Then they checked the IP and DNS settings on the NICs, which were fine. Then they checked for static DNS server settings on the transport services. I didn’t think of that and should have!

This is where they found the problem. Somehow the IPv6 address for the DNS servers was set on the back-end transport service on my IZSRVEX01 server, the one where the queues were backing up. Messages both to the Internet, via O365, and to internal mailboxes, even those on the local server, were getting stuck.

Here are the cmdlets PSS ran to find these settings:

Get-TransportService | ft  name, *DNSAdapterGuid
Get-FrontendTransportService | ft  name, *DNSAdapterGuid
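With more than a couple of servers it can be easier to list only the ones where a DNS adapter GUID is not all zeros. The following is an illustrative sketch, not something PSS ran. The `$emptyGuid` variable and the string comparison are assumptions, chosen so the filter behaves the same whether the shell returns the values as GUID objects or as strings.

```powershell
# The all-zeros GUID is the default: DNS lookups use the settings
# of any available network adapter.
$emptyGuid = [guid]::Empty.ToString()

# Back-end transport services with a specific adapter GUID configured
Get-TransportService |
    Where-Object {
        "$($_.InternalDNSAdapterGuid)" -ne $emptyGuid -or
        "$($_.ExternalDNSAdapterGuid)" -ne $emptyGuid
    } |
    Format-Table Name, InternalDNSAdapterGuid, ExternalDNSAdapterGuid -AutoSize

# Same check for the front-end transport services
Get-FrontendTransportService |
    Where-Object {
        "$($_.InternalDNSAdapterGuid)" -ne $emptyGuid -or
        "$($_.ExternalDNSAdapterGuid)" -ne $emptyGuid
    } |
    Format-Table Name, InternalDNSAdapterGuid, ExternalDNSAdapterGuid -AutoSize
```

No output from either pipeline would suggest every server is still on the default setting.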

On IZSRVEX01 the InternalDNSAdapterGuid value was set to something other than all zeros. So PSS cleared the values with this cmdlet:

Get-TransportService | ? {$_.Name -NotLike "*EDGE*"} | Set-TransportService -InternalDNSAdapterGuid 00000000-0000-0000-0000-000000000000 -ExternalDNSAdapterGuid 00000000-0000-0000-0000-000000000000
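After a change like this, it seems reasonable to confirm the values and then nudge the stuck queues instead of waiting for the next retry interval. This is a hedged sketch of that follow-up, not part of what PSS ran. The server name is the one from this post, and whether a transport service restart is also needed is not something covered here.

```powershell
# Show every DNS-related setting on the back-end transport service.
# The wildcard avoids assuming exact property names.
Get-TransportService -Identity IZSRVEX01 | Format-List Name, *DNS*

# List the queues on that server that are still in Retry
Get-Queue -Server IZSRVEX01 -Filter {Status -eq "Retry"} |
    Format-Table Identity, MessageCount, NextHopDomain, LastError -AutoSize

# Force an immediate connection attempt for those queues
Retry-Queue -Server IZSRVEX01 -Filter {Status -eq "Retry"}
```

If the DNS settings were the cause, the MessageCount on the Retry queues should start dropping shortly after the retry.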

I’m not sure how these got set. My best wild guess is that it happened when moving VMs between Hyper-V servers. When doing this I’ve seen virtual NICs get lost and have had to reconfigure them, but I’m still not sure why this would cause the transport service to have a static IPv6 address set.

What makes troubleshooting this difficult is that, starting with Exchange 2013, there are two transport services: the Front-End (FE) transport service, which the *-FrontendTransportService cmdlets apply to, and the Back-End (BE) transport service, which the *-TransportService cmdlets apply to. Also, by default the BE transport services do not have logging enabled. After I enabled logging, which I normally do via a transport configuration script, I did find which logs had errors. This should have led me to check the DNS settings on the transport services, but I missed that.
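For reference, turning on BE logging can be sketched roughly as follows. This is not the configuration script mentioned above, only a minimal example under the assumption that connectivity logging and verbose intra-organization protocol logging are the settings of interest. The server name is the one from this post.

```powershell
# Enable connectivity logging and verbose protocol logging for the
# implicit intra-organization Send connector on the back-end service
Set-TransportService -Identity IZSRVEX01 `
    -ConnectivityLogEnabled $true `
    -IntraOrgConnectorProtocolLoggingLevel Verbose

# Confirm the settings and see where the logs are written
Get-TransportService -Identity IZSRVEX01 |
    Format-List Name, ConnectivityLog*, IntraOrgConnectorProtocolLoggingLevel, SendProtocolLogPath
```

Verbose protocol logging adds disk activity, so it may be worth setting it back to None once the problem is found.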

Errors found using cmdlets

Using the [Get-Queue] cmdlet:

Identity     MessageCount NextHopDomain            Status LastError
--------     ------------ -------------            ------ ---------
IZSRVEX01\4           115 edgesync - home to o365  Retry  451 4.4.0 DNS query failed. The error was: DNS query failed with error ErrorRetry
IZSRVEX01\5           497 mailboxes                Retry  451 4.4.0 DNS query failed. The error was: DNS query failed with error ErrorRetry

You can see above that messages to both the Edge server, which then delivers to Office 365, and to the “mailboxes” database were stuck.

Below are further signs of the issues and log entries.

Using the [Get-Queue -Identity IZSRVEX01\5 | FL]:

RunspaceId            : 42ba65c4-de75-4a73-81c3-8c97f9a5a314
DeliveryType          : SmtpDeliveryToMailbox
NextHopDomain         : mailboxes
TlsDomain             :
NextHopConnector      : 500b24dd-bda7-49e5-816d-5e9ea8d9360b
Status                : Retry
MessageCount          : 1
LastError             : 451 4.4.0 DNS query failed. The error was: DNS query failed with error ErrorRetry

 

From the BE transport connectivity log, default path: C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\Logs\Hub\Connectivity\CONNECTLOG<date>.LOG

2016-01-18T18:56:56.592Z,08D32039214A4CB4,SMTP,edgesync - home to o365,>,DNS server returned ErrorRetry reported by 255.255.255.255. [Domain:Result] = IZSRVEDGE01.altered.com:ErrorRetry; IZSRVEDGE02.altered.com:ErrorRetry;

2016-01-18T18:56:56.592Z,08D32039214A4CB4,SMTP,edgesync - home to o365,-,Messages: 0 Bytes: 0 (The DNS query for  'SmtpRelayWithinAdSiteToEdge':'edgesync - home to o365':'54fa82f8-4b9d-49fe-acbd-2f968f11a3cd' failed with error : ErrorRetry)

From the BE SMTP send protocol log, in C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\Logs\Hub\ProtocolLog\SmtpSend:

2016-01-21T18:40:50.468Z,Intra-Organization SMTP Send Connector,08D322903B262F35,1,,[da8:6c3:ce53:a890::42]:2525,*,,"Failed to connect. Winsock error code: 10061, Win32 error code: 10061, Error Message: No connection could be made because the target machine actively refused it [da8:6c3:ce53:a890::42]:2525"
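Entries like the ones above can be pulled out of the log folders with a quick search instead of opening each file. This is a rough sketch that assumes the logs are in their default locations and that the `ExchangeInstallPath` environment variable, which Exchange setup normally creates, is present on the server. The search strings are simply taken from the log lines in this post.

```powershell
# Root of the back-end (Hub) transport logs, assuming default locations
$logRoot = Join-Path $env:ExchangeInstallPath 'TransportRoles\Logs\Hub'

# DNS failures in the connectivity logs (most recent 20 matches)
Get-ChildItem -Path (Join-Path $logRoot 'Connectivity') -Filter 'CONNECTLOG*.LOG' |
    Select-String -Pattern 'ErrorRetry', 'DNS query' -SimpleMatch |
    Select-Object -Last 20 Filename, LineNumber, Line

# Refused connections in the SMTP send protocol logs
Get-ChildItem -Path (Join-Path $logRoot 'ProtocolLog\SmtpSend') -Filter '*.LOG' |
    Select-String -Pattern 'Failed to connect' -SimpleMatch |
    Select-Object -Last 20 Filename, LineNumber, Line
```

If the log paths were changed on the server, the `ConnectivityLogPath` and `SendProtocolLogPath` values from Get-TransportService show where to look instead.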

About Jason Sherry

I am a ~30 year Exchange consultant and expert. I currently work for Commvault as a Solutions Specialist for Microsoft Infrastructure. For more info see my resume at: http://resume.jasonsherry.org
