1

I need to calculate PayPal fees or an online shop to show an overview table with the fees per day. I came up with a solution playing this over several views, but it's dead slow. I takes more than a seven minutes for around 4200 original rows of order data with Paypal fees to calculate.

Hence, I am looking for a faster solution, but this is over my head.

Paypal fees have a fix base price per transaction and a percentage on top, which is tiered. The tier is selected by the sum of turnaround payed with PayPal in the last month (I assume 30 days).

This is the tiered fee information:

  • tier 0 0 - 2.500 €: 3,4% + 0,35 €
  • tier 1 2.501 - 10.000 €: 2,9% + 0,35 €
  • tier 2 10.001 - 50.000 €: 2,7% + 0,35 €
  • tier 3 50.001 - 100.000 €: 2,4% + 0,35 €
  • tier 4 100.001 - unlimited €: 1,9% + 0,35 €

What I came up so far are a couple of views based on order data imported from the shop:

-- values are all multiplied by 100 to have int values -- calculate the fees on every order. all possible tiers. CREATE VIEW public.shopify_created_paypal_tiered_fees AS SELECT order_id, created_at::DATE as date total_price, round(0.35 + customer_paid * 0.034, 0) as tier_0_fee, round(0.35 + customer_paid * 0.029, 0) as tier_1_fee, round(0.35 + customer_paid * 0.027, 0) as tier_2_fee, round(0.35 + customer_paid * 0.024, 0) as tier_3_fee, round(0.35 + customer_paid * 0.019, 0) as tier_4_fee FROM public.shopify_order_financial_data WHERE payment_gateway_names @> ARRAY['paypal']::text[]; -- group and sum them up by day CREATE VIEW public.shopify_created_paypal_tiered_fees_by_day AS SELECT created_at_date, sum(total_price) as total_price_sum, sum(tier_0_fee) as tier_0_fee_sum, sum(tier_1_fee) as tier_1_fee_sum, sum(tier_2_fee) as tier_2_fee_sum, sum(tier_3_fee) as tier_3_fee_sum, sum(tier_4_fee) as tier_4_fee_sum FROM public.shopify_created_paypal_tiered_fees GROUP BY created_at_date ORDER BY created_at_date desc; -- calculate the rolling sum of the last 30 days per day. CREATE VIEW public.shopify_created_pp_turnaround_of_last_30_days_by_day AS SELECT created_at_date, total_price_sum, (SELECT sum(total_price_sum) FROM public.shopify_created_paypal_tiered_fees_by_day WHERE created_at_date BETWEEN SYMMETRIC oft.created_at_date::DATE AND (oft.created_at_date::DATE - INTERVAL '30 DAY')::DATE) as pp_turnaround_of_last_30_days FROM public.shopify_created_paypal_tiered_fees_by_day as oft; -- based on rolling sum 30 days before order's created_at_date use the correct fee and add info or which one was used. CREATE VIEW public.shopify_created_paypal_active_tier_fees AS SELECT pptfbd.created_at_date, pptolm.total_price_sum, pp_turnaround_of_last_30_days, CASE WHEN pp_turnaround_of_last_30_days BETWEEN 0 AND 250000 THEN tier_0_fee WHEN pp_turnaround_of_last_30_days BETWEEN 250000 AND 1000000 THEN tier_1_fee WHEN pp_turnaround_of_last_30_days BETWEEN 1000000 AND 5000000 THEN tier_2_fee WHEN pp_turnaround_of_last_30_days BETWEEN 5000000 AND 10000000 THEN tier_3_fee ELSE tier_4_fee END as pp_fee_for_day, CASE WHEN pp_turnaround_of_last_30_days BETWEEN 0 AND 250000 THEN 'tier 0' WHEN pp_turnaround_of_last_30_days BETWEEN 250000 AND 1000000 THEN 'tier 1' WHEN pp_turnaround_of_last_30_days BETWEEN 1000000 AND 5000000 THEN 'tier 2' WHEN pp_turnaround_of_last_30_days BETWEEN 5000000 AND 10000000 THEN 'tier 3' ELSE 'tier 4' END as pp_tier_for_day, CASE WHEN pp_turnaround_of_last_30_days BETWEEN 0 AND 250000 THEN '3,4% + 0,35 €' WHEN pp_turnaround_of_last_30_days BETWEEN 250000 AND 1000000 THEN '2,9% + 0,35 €' WHEN pp_turnaround_of_last_30_days BETWEEN 1000000 AND 5000000 THEN '2,7% + 0,35 €' WHEN pp_turnaround_of_last_30_days BETWEEN 5000000 AND 10000000 THEN '2,4% + 0,35 €' ELSE '1,9% + 0,35 €' END as pp_fee_formula_for_day, tier_0_fee, tier_1_fee, tier_2_fee, tier_3_fee, tier_4_fee FROM public.shopify_created_paypal_tiered_fees pptfbd INNER JOIN public.shopify_created_pp_turnaround_of_last_30_days_by_day pptolm ON pptfbd.created_at_date = pptolm.created_at_date; 

Any suggestions for speeding this up? Thanks!

3
  • 1
    Did you try a window function instead? Commented May 31, 2019 at 11:46
  • @a_horse_with_no_name I tried to wrap my head around them, but it was not bright enough for that today... Commented May 31, 2019 at 20:40
  • Something like: sum(total_price_sum) over (order by created_at_date range between '30 day' preceding and current row) as pp_turnaround_of_last_30_days Commented Jun 1, 2019 at 4:13

1 Answer 1

1

The issue you are running into is because you are creating nested views which can cause postgress to lose track of indexes and not use them properly. This is something that I have noticed before when you get multiple layers deep the optimizer will no longer be using the indexes. A good way to check this would be do an explain on each view.

What you can do is use a CTE(Common Table Expression) in order to do all the calculations in a single view. This will buy you two things.

  1. The optimizer will be able to hit all indexes
  2. You will be able to limit the data you are looking at to the last 30 days so the database will have less items to work at which will help performance.

https://www.postgresql.org/docs/current/queries-with.html

2
  • Currently going over the explain plans. Damn, they look ugly! :D Thanks for the tip! Commented May 31, 2019 at 19:31
  • I joined together everything as a single query and now it performes below a second! Great improvement :) Commented May 31, 2019 at 20:38

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.