bionfriend.blogg.se - Postico redshift

#Postico redshift update#

This took about 1.5 seconds (on my completely un-optimized postgres instance)

#Postico redshift update#

UPDATE test SET num = (CASE WHEN id = 1 THEN 123 CASE WHEN id = 2 THEN 356. I ran two queries to update the first 1000 'num' columns with random values. I filled it with 1000000 rows in sequential order (just a generate_series()), so the first row was (1,1), the second was (2,2) etc. I made a postgres table with two columns, id and num. So I was actually wondering if I was correct after seeing some comments here. Keep in mind, I also said: "If performance becomes a problem on this query, you can fix it later when you can focus on just that one issue and understand the specific problem much better." For example, the example you provided here is not even valid SQL. It shouldn't be your first choice, is all I'm saying. My point is that the complexity of writing the statement is non-trivial, and unless you have a good reason to you shouldn't use CASE. Yes, you can certainly find instances where the performance is better, but it may not be much. My point is that, in a vacuum, you can't really say anything about what the performance difference will be. With databases, the answer for almost everything is "it depends." Here, it really depends on how big the table is, how many rows you're updating, what the indexes are, how much RAM is available, etc. It can be a lot faster, but it can also be slower. It all depends on exactly what you're doing with the table. If you're only updating a portion of the table, or if the number of rows that you'll actually be updating is comparatively small, then the list of single UPDATE statements can perform much better. However, the list of single UPDATE statement can perform much better than a monolithic statement. However, if the table is not billions of rows, it can probably fit that index (or even the row data for small tables) in memory and will cache hit on everything. What's expensive here is that the database server has to scan the index or row data on member_status.

If you're not returning data and you reuse the connection like you're supposed to, "round trips" cost is essentially nothing. > You can imagine how many round trips this would take to the server if multiple individual UPDATE statements had been run. If performance becomes a problem on this query, you can fix it later when you can focus on just that one issue and understand the specific problem much better. Let the query planner and optimizer do the work. In general, however, make your queries difficult for the server and easy for you, because you make a ton more mistakes than the server ever will. UPDATE reward_members SET member_status = 'silver_group' WHERE member_status = 'silver' UPDATE reward_members SET member_status = 'platinum_group' WHERE member_status = 'platinum' UPDATE reward_members SET member_status = 'bronze_group' WHERE member_status = 'bronze' UPDATE reward_members SET member_status = 'gold_group' WHERE member_status = 'gold' If you really need an atomic change you can just do this: It gets extremely difficult extremely quickly to tell where you have an error, and it's very, very easy to make a very costly mistake. Yes, it's really neat to update everything in a single statement, and in some situations it can perform significantly better, but CASE expressions in an UPDATE statement quickly get complicated to map out in your head. This is premature optimization of the kind you want to avoid. Oh, man, as a seasoned DBA/Data Analyst, don't do this unless you have a really good reason to.

UPDATE multiple rows with a CASE expression