Introduction

Terraform uses APIs (Application Programming Interfaces) to interact with your cloud provider’s platform. Hence, many error messages thrown by your Terraform deployment come directly from the cloud platform itself (i.e., OCI services). In some cases they prove to be very unhelpful and devoid of insight, leaving you to wonder what is breaking your deployment. These are part of what we call API errors. In my case, I spent weeks pulling my hair out trying to find what was really behind my API 404 error.

Therefore, I decided to summarize both what is documented and what actually happened in my case.

Service API Errors

First, since it’s always better to lay down the basics, I would like to give a brief overview of the API error structure in OCI.

Service error messages returned by the OCI Terraform provider include the following information:

  • Error – The HTTP status and API error codes
  • Provider Version – The version of the OCI Terraform provider used to make the request
  • Service – The OCI service responding with the error
  • Error Message – Details regarding the error returned by the service
  • OPC Request ID – The request ID
  • Suggestion – Suggested next steps

For example, as shown in the official OCI documentation, the output is very similar to that of common REST API errors:

Error: <http_code>-<api_error_code>
Provider version: <provider_version>, released on <release_date>. This provider is <n> updates behind to current.
Service: <service>
Error Message: <error_message>
OPC request ID: exampleuniqueID
Suggestion: <next_steps>


Commonly Returned Service Errors

This list is not exhaustive and includes only the error message and suggestion for each error:

  • “400-LimitExceeded” – service limits exceeded for a resource
Error: 400-LimitExceeded
Error Message: Fulfilling this request exceeds the Oracle-defined limit for this tenancy for this resource type.
Suggestion: Request a service limit increase for this resource <service>
  • “500-InternalError” – definitely means you’ll have to call your Oracle support friends
Error: 500-InternalError
Error Message: Internal error occurred
Suggestion: Please contact support for help with service <service>


When are API credentials checked? 

Terraform Core Workflow

Let’s review the Terraform core workflow and its actions for the sake of completeness.

Workflow Commands:

init        Prepare your working directory for other commands
validate    Check whether the configuration (modules, attribute names, and value types) is valid
plan        Show changes required by the current configuration
apply       Create or update infrastructure
destroy     Destroy previously-created infrastructure
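
For reference, a typical end-to-end run of this workflow from the shell (my scenario: a fresh configuration with no data sources and no existing state):

terraform init       # downloads the OCI provider plugin; no credential check
terraform validate   # static checks only (syntax, attribute names, value types)
terraform plan       # builds the execution plan
terraform apply      # creates the resources; the first authenticated OCI API calls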

Now here’s the thing: although init downloads the cloud provider plugin, none of the first three commands verifies your API credentials. I blindly assumed otherwise, which was careless, but it has never been made very explicit in the documentation.

Bottom line:
terraform apply is the only step where API credentials are checked, but that’s not all…      
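
A workaround if you want credentials verified earlier: data sources are resolved during terraform plan, so reading any OCI data source forces an authenticated API call before anything gets created. A minimal sketch (the oci_identity_availability_domains data source is a real one from the OCI provider; the block name is just illustrative):

data "oci_identity_availability_domains" "cred_check" {
  # The tenancy OCID is also the OCID of the root compartment
  compartment_id = var.tenancy_ocid
}

With this in place, terraform plan fails fast on bad or mismatched credentials instead of leaving the surprise to apply.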


What the heck is 404-NotAuthorizedOrNotFound?

What I saw

  • The suggestion accompanying the error was nowhere close to helping me out of this conundrum.

  • My configuration was supposed to deploy a DB System stack, and all previous commands (init, validate, plan) had completed successfully.

What the OCI API Errors page says

I found this note: “Verify the user account is part of a group with the appropriate permissions to perform the actions in the plan you are executing.”

However, I could still create resources in the console with the same user without any permission issue.
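
In hindsight, the console test proved nothing about the API key pair: console login authenticates with a password, while Terraform signs every request with the key declared in terraform.tfvars. A cheap way to exercise the key-based credentials outside Terraform (assuming the OCI CLI is installed and its ~/.oci/config holds the same tenancy/user/fingerprint/key values) is any trivial call, for example:

# The CLI signs this request with the API key; a mixed-up
# tenancy/user/key combination fails here with the same 404.
oci iam region list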

After struggling with it for several days, I decided to put it on ice for a few weeks.


The Real Root Cause 

Two weeks later, I accidentally found where all this mess came from; the problem had been right under my nose the whole time. I finally decided to check my terraform.tfvars and go through each OCI authentication variable one by one.

That’s where I noticed that my credentials were mixed up between two tenancies:

# Oracle Cloud Infrastructure Authentication
tenancy_ocid     = "ocid1.tenancy.oc1.."      # TENANCY 1
user_ocid        = "ocid1.user.oc1.."         # USER FROM TENANCY 2
fingerprint      = "1c:"                      # TENANCY 1 
private_key_path = "~/oci_api_key.pem"        # TENANCY 1
public_key_path  = "~/oci_api_key_public.pem" # TENANCY 1
compartment_ocid = "ocid1.compartment.oc1."   # COMPARTMENT 2  
  • I agree that’s pretty screwed up, but it can happen when working with different tenancies from the same workstation.
  • Once the discrepancy was corrected, the plan ran successfully and the Terraform stack deployed as expected. A defensive check like the one sketched below would have caught this much earlier.
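
To guard against this in the future, Terraform itself can assert that the declared user really belongs to the declared tenancy. A minimal sketch, assuming Terraform 1.2+ (for postconditions) and the standard oci_identity_user data source:

data "oci_identity_user" "tf_user" {
  user_id = var.user_ocid

  lifecycle {
    postcondition {
      # For an IAM user, compartment_id holds the OCID of its home tenancy
      condition     = self.compartment_id == var.tenancy_ocid
      error_message = "user_ocid does not belong to tenancy_ocid: credentials are mixed between tenancies."
    }
  }
}

If the credentials are mixed up, either the API call itself or the postcondition fails at plan time, long before apply touches any resource.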


Conclusion

  • It’s very easy to make mistakes when deploying with Terraform, especially when working with different tenancies.
  • We also saw how API errors may not offer the most helpful insight into the root cause of such issues.
  • Bottom line: always verify your credentials, and rely on source versioning platforms where credentials can be safely stored for each of your environments/repos (GitLab, OCI DevOps, Terraform Cloud); one simple local alternative is sketched after this list.
  • Additionally, these platforms let CI/CD pipelines automate your testing as your workload and team grow.
  • I hope this post helps those who encounter the same issue and spares them days of unnecessary troubleshooting.
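
On the “verify your credentials” point: Terraform reads any input variable from a TF_VAR_-prefixed environment variable, so each environment can get its own credential set without it ever landing in terraform.tfvars. A minimal sketch with placeholder values (one file per environment, kept out of version control or stored as CI/CD secret variables):

# env-tenancy1.sh : source this file before running terraform against tenancy 1
export TF_VAR_tenancy_ocid="ocid1.tenancy.oc1.."
export TF_VAR_user_ocid="ocid1.user.oc1.."
export TF_VAR_fingerprint="1c:"
export TF_VAR_private_key_path="~/oci_api_key.pem"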